SUMMARY

A  powerful  new  framework  is  presented  for  the  analysis  of  distributed 
detection  networks,  supported  by  a  compact  notation  for  the  description  of 
complicated  networks.  The  new  methods  are  applicable  to  problems  with  any 
number  of  threats,  any  number  of  messages,  and  any  number  of  available  actions. 
As  illustrations,  the  Sensor  Calculus  is  applied  to  reveal  some  interesting  features 
of  the  case  of  two-fold  threats,  messages  and  possible  actions.  These  features 
include  the  occurrence  of  spontaneous  symmetry  breaking  with  identical  sensors, 
and  the  sub-optimality  of  deterministic  tuning  for  fusion  systems. 




FOREWORD 


This report covers the first year of a three-year study of the application of optimal control
theory to the design of distributed sensor systems. The work is focused on the key links
between detection, discrimination and decision. Detection is an engineering/physics problem.
Discrimination is affected by softer considerations such as estimates of prior probabilities,
which depend on intelligence as well as engineering information. Decision involves even softer
estimates of the costs and values associated with various possible damage or loss. Thus the
Detection-Discrimination-Decision ($D^3$) problem spans a range from hard facts to volatile
speculation.

Our work concentrates on building a rigorous framework in which hard data serve to define
an operating characteristic, and softer data are used to define the optimal tuning of a sensor
system. Improvements in detection and discrimination always carry price tags. In the $D^3$
framework the benefits (in damage control) may be weighed against those costs, for best
system management.

The  principal  results  of  the  first  year’s  work  are: 

(1)  Formalization  of  the  concept  of  a  detector  operating  characteristic  (doc),  in  a  form  correct 

for  generalization  to  any  number  of  possible  threats,  any  number  of  available  responses, 
and  any  channel  message  carrying  capacity. 

(2)  Development  of  a  powerful  and  compact  notation  suitable  for  describing  any  network  of 

sensors  and  for  determining  its  doc  and,  as  appropriate,  its  optimal  tuning.  (The  Sensor 
Calculus). 

Using  these  tools  we  have  established  a  number  of  significant  specific  results  for  the  simplest 
case  (one  threat,  one  response,  and  binary  message  schemes.)  Among  these  results  are: 

(1)  Spontaneous  symmetry  breaking.  It  is  often  found  that  when  two  identical  sensors  are  used 
to  inform  a  fusion  (decision)  center,  their  optimal  tunings  are  the  same.  We  have 
established,  by  specific  examples,  that  this  is  not,  in  general,  true.  It  can  be  the  case 
that symmetrical tuning is less effective than a suitable symmetry-breaking choice of
tuning. This significantly increases the complexity of finding optimal solutions, but
permits  improvement  in  overall  performance  of  the  detection  network. 

(2)  Discontinuity  of  optimal  tuning  in  fusion.  We  have  established  that,  quite  generally,  in 

fusion  systems,  the  optimal  tuning  may  be  discontinuous  as  a  function  of  the  softer 
parameters  such  as  prior  probabilities  and  estimated  cost.  This  has  serious  implications 
for  optimal  system  design  because  the  soft  parameters  are  subject  to  significant  change 
after  a  system  has  been  constructed.  Every  effort  must  be  made  to  ensure  that  the  likely 
range  of  variability  does  not  include  tuning  discontinuities. 

(3)  When  a  sensor  communicates  over  a  limited  channel  there  is  a  loss  of  information.  If  one 

sensor  is  better  than  another,  should  the  better  one  send  or  receive  the  message  over  a 
limited  channel?  Using  the  sensor  calculus  techniques  we  have  established  that  there  is 
no  general  rule  governing  this.  In  some  situations  one  alternative  is  better,  and  in  other 
situations  the  other  is  better. 

(4)  The  best  achievable  deterministic  architecture  will  be  significantly  suboptimal  in  some 

resource-constrained  situations.  This  has  serious  implications  for  the  allocation  of 
resources  among  interceptors  and  sensors. 

In all of this analysis the ability to move easily from a discrete to a continuous formulation
has enormously clarified our understanding of the problem. We are firmly convinced that
reliance on analytical approximations is an artificial and dangerous restriction in the study of
$D^3$ problems.

Our  plans  for  the  second  and  third  years  of  the  project  are  to  continue  this  line  of  research 
by:  (1)  developing  algorithms  that  will  accomplish  the  basic  operations  of  the  sensor  calculus 
as  efficiently  as  possible;  (2)  extending  the  results  to  the  case  of  more  than  two  possible  states 
of  nature  (as,  for  example,  when  there  may  be  a  variety  of  decoy  threats);  (3)  extending  the 
results  to  the  case  of  more  than  two  possible  actions  and/or  messages  between  sensors;  (4) 
extending  the  formalism  to  deal  with  “call-back”  systems  in  which  some  message  combinations 
may  result  in  a  polling  of  the  sensors;  (5)  exploration  of  the  implications  of  the  maximum 
entropy  principle  for  scheduling  such  “call-backs”.  The  overall  goal  of  the  research  is  to  bring 
the task of combining sensor characteristics to a highly automated state, so that the
consideration of alternative architectures will be reduced to “cook-book” calculations using the
calculus  of  sensors. 

Acknowledgements:  It  is  a  pleasure  to  acknowledge  stimulating  conversations  with  Dr.  Keith 
Taggart, SDIO/CMO, Jason Goodfriend, Joshua Scharf and R. Barry Thomas at System

1. Introduction and Notation.

This report is part of an ongoing effort to resolve the problem of detection,
discrimination and decision ($D^3$ problem) into design of the Detection-Discrimination network
on the one hand, and discussion of the Decision aspects on the other. We find that the natural
link between these areas is given by a powerful construct termed the doc (for detector
operating characteristic). The doc generalizes the notion of the ROC (Receiver Operating
Characteristic), which is the boundary of the doc in the familiar cases. The presentation is in
two main parts: the first deals with the Detection-Discrimination Network; the second deals
with the decision problem. We find that the doc plays a central role by (i) describing the
overall characteristics of the network for use in the decision problem and by (ii) providing the
necessary and sufficient information about each sensor to support solution of the problem of
optimal network design. A review of related literature is included as Appendix B.

1.2  Notation. 

In Part I we show how a sensor S can be fully characterized by a set of points $\mathcal{D}(S)$
called the doc of S. We show how the doc is built up from the signal set Y using the response
functions $f_1(y), \ldots, f_H(y)$ corresponding to some exclusive and exhaustive list of alternate
hypotheses about the world, $k = 1, \ldots, H$. We show that the familiar ROC is, in some sense, the
boundary of the doc. We introduce the useful concepts of the full product of sensors S and S',
$S \otimes S'$, and of the M-fold restriction of a sensor, $\mathcal{R}(M)S$. This latter concept is useful because
communication within sensor networks is constrained by the capacities of communication
channels.

1.3  Definitions 

Although  our  presentation  will  not  be  highly  formal,  we  state  here  the  definitions  of 
the  key  concepts  of  the  sensor  calculus. 

A Sensor (S) consists of a signal set Y, which may be discrete or continuous, and a
collection of non-negative conditional probability measures defined on Y, $df_1, \ldots, df_H$,
corresponding to the possible states of nature.

The detector operating characteristic (doc), $\mathcal{D}(S)$, is a set of points in an H-dimensional
linear vector space, consisting of all points of the form $(f_1(Y(t)), \ldots, f_H(Y(t)))$ where Y(t) is any
measurable subset of Y and $f_k(Y(t))$ is the sum or integral of the measure $df_k$ over the set Y(t).
The doc $\mathcal{D}$ always lies within the closed unit hypercube in the positive orthant determined by
the origin and the point $E = (1, \ldots, 1)$.

The boundary of the doc of S, $\mathcal{B}(S)$, is defined as the set of extreme points of $\mathcal{D}(S)$. A
point $P \in \mathcal{D}$ is an extreme point if there exists a separating hyperplane $\{x : n \cdot x = c\}$ such that
$n \cdot P = c$, and $n \cdot x \ge c$ for all $x \in \mathcal{D}(S)$.

An M-fold restriction with tuning t, $\mathcal{R}(M,t)S$, is defined in terms of its doc. The tuning
t defines a partition of Y into M sets $\{Y(m)\}_{m=1}^{M}$. The doc $\mathcal{D}(\mathcal{R}(M,t)S)$ is the discrete
set consisting of all the points $f(Y(m))$. This corresponds to using the sensor S to select one
from a set of M options, which may be actions or messages.

1.4  Examples 

To  illustrate  the  notation  we  consider  the  networks  shown  in  Figure  1.  The 
corresponding  expressions  in  the  calculus  of  sensors  are: 

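Figure 1a. $\mathcal{R}(2)\{\mathcal{R}(2)S_1 \otimes \mathcal{R}(2)S_2\}$

Figure 1b. $\mathcal{R}(2)\{S_3 \otimes \mathcal{R}(2)\{\mathcal{R}(2)S_1 \otimes \mathcal{R}(2)S_2\}\}$

Figure 1c. $\mathcal{R}(2)[S_3 \otimes \mathcal{R}(2)\{S_2 \otimes \mathcal{R}(2)S_1\}]$

Figure 1d. $\mathcal{R}(2)[\ldots]$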

The interior restrictions $\mathcal{R}(2)$ represent a limitation of the communication channels to
two-fold signals. The final overall restriction $\mathcal{R}(2)$ represents the fact that there are only two
courses of action available. The notation is easily generalized to admit other capacities. Note
that this description of a sensor or network, and the concept of a doc (like the concept of an
ROC), does not specify a particular tuning of the sensor. Similarly, the restriction operator
represents the whole range of possible choices for the restriction.

Figure 1a. An example of “fusion” structure. Corresponding to the expression $\mathcal{R}(2)S_j$ there is a
box representing sensor $S_j$, from which there comes an arrow representing a two-fold message.
The wavy line represents a signal $y \in Y_j$. The solid box represents the sensor product of the two
restricted sensors.

Figure 1b. The fusion of messages from sensors 1 and 2 is combined in sensor product with the
full information from sensor 3, forming a series structure.

Figure 1c. A pure series structure.

In Part II we discuss how the optimal decision process for a set of alternative hypotheses H and
a set of possible actions A = {1, 2, ..., A} requires only the doc characterizing the network as a
whole. We show how problems with “complete knowledge” of the cost matrix $C(a,k)$ and of
the prior probabilities $p_1, \ldots, p_H$ are solved using the doc. We also describe the “Neyman-Pearson”
problem (NP), in which C and p are not needed, and show that it is solved by the
boundary of the doc.


This fact, that the doc solves the NP and all possible Bayesian problems, is very
important, because the doc $\mathcal{D}$ is completely determined by the hardware. It is a concrete
engineering characterization of the network. The costs, $C(a,k)$, and the prior probabilities
$p_1, \ldots, p_H$ are likely to be much softer. They do not originate in engineering constraints, and
may change rapidly.




The organization of this paper is as follows. Section 2 contains some examples of the
doc $\mathcal{D}(S)$, the full product $\otimes$ and the k-fold restriction $\mathcal{R}$ for cases in which the space of
signals, Y, is discrete. Section 3 gives examples for continuous signal sets. Section 4 presents
fundamental network considerations for the specific case H=2. Section 5 gives some specific
results for this case, including an example of spontaneous symmetry breaking, an example of a
non-convex doc, a counter-example for series structure and the optimization procedures for any
fixed topology, with either free or fixed combinative logic.

Part II begins with Section 6, which covers the use of the doc to solve both the
Neyman-Pearson and the Bayesian problems. Section 7 contains a discussion of the
discontinuities  of  system  tuning  parameters  in  the  case  of  fusion,  and  the  continuity  of  the  best 
achievable  cost.  Section  8  shows  how  resource  constraints  modify  the  decision  process,  and  may 
lead  to  a  cost  (performance  gap)  when  only  deterministic  tunings  are  available.  Section  9 
illustrates  the  application  of  the  doc  and  its  boundary  to  the  problem  of  team  action. 

The value of the doc lies in the fact that it permits a value-free comparison of
alternative architectures, however complicated.

2. Some discrete examples of the detector operating characteristic $\mathcal{D}(S)$.

Quite  generally,  the  signal  set  and  conditional  probability  distributions  which  define  a 
sensor  can  be  described  by  a  fundamental  table  of  numbers. 

y ∈ Y:       1     2     3     4     5     6    ...

h=1  (y):   .1    .3    ...
h=2  (y):   .4    .0    ...                              (1)
h=H  (y):   .4    .2    ...

The  columns  of  the  table  are  labeled  by  the  elements  of  the  signal  set  Y,  which  is  taken, 
throughout  this  section,  to  be  discrete.  Even  when  the  physical  reality  is  a  continuous  signal, 
the  practicalities  of  measurement  will  always  force  us  to  assign  the  observations  to  a  finite 
number  of  discrete  bins.  With  this  in  mind  we  will  often  refer  to  the  elements  of  the  signal  set 
as  “bins.”  The  elements  in  each  row  represent  the  conditional  probability  that  the  observed 
signal  will  have  the  indicated  value,  provided  that  the  state  of  nature  which  labels  the  row  is 
indeed true. The elements in a row are called the “values of the response function
corresponding to the indicated state of nature.” The row sums are 1, and the column sums
have  no  particular  meaning.  We  will  restrict  our  examples  to  the  case  in  which  the  number  of 
possible  states  of  the  world  H=2.  We  bear  in  mind,  however,  that  the  common  usage  of  “0” 
and  “1”  as  the  labels  suggests  an  asymmetry  among  the  hypotheses  and  the  actions  which  does 
not  exist  in  general.  (Although  there  will  be  at  least  one  action  which  is  “best”  if  a  given  state 
of  nature  prevails,  the  remaining  actions  may  be  “wrong”  to  differing  degrees,  and  their 
ordering  will  change  according  to  which  state  of  nature  does  indeed  prevail.) 

2.1  Specific  Examples 

Consider a specific concrete example S (referred to below as $S_1$) given by the fundamental table:

y ∈ Y:       1     2     3

S =  h=1:   .6    .3    .1                               (2)
     h=0:   .1    .3    .6

For any sensor, the notion of “tuning” amounts to specifying the circumstances under
which  a  particular  action  (or  signal,  if  the  sensor  is  imbedded  in  a  network)  will  be  chosen.  For 
example,  we  may  represent  the  set  of  signals  leading  to  the  action  “a=l”  as  Y(a=l),  which  is  a 
subset  of  Y.  The  corresponding  probabilities  to  act,  given  the  alternative  states  of  nature  “0” 
and  “1”  are  represented  in  the  table  of  bin  combinations: 


Y(a=1):              ∅     {1}    {2}    {3}    {1,2}   {1,3}   {2,3}   {1,2,3}

$\mathcal{D}(S)$ =  h=1:   .0     .6     .3     .1     .9      .7      .4      1.0
          h=0:   .0     .1     .3     .6     .4      .7      .9      1.0         (3)

The pair of conditional probabilities in any given column may be taken as the coordinates of a
point in a two-dimensional space. The dimensionality of the space is given not by the number
of actions, but by the number of hypotheses (H). We refer to this set of points in an abstract
space as the doc $\mathcal{D}(S)$. It contains all of the useful information in the table. The points in the
doc may each be labeled by the subsets to which they correspond.
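
The construction of the doc from a fundamental table can be sketched in a few lines of Python. This is only an illustration: the function name `doc_points` and the dictionary layout are hypothetical choices, and the numerical values are read off the single-bin columns {1}, {2}, {3} of the table of bin combinations.

```python
from itertools import combinations

# Fundamental table of the three-bin example, written as
# bin -> (P(bin | h=1), P(bin | h=0)). The values are the single-bin
# columns {1}, {2}, {3} of the table of bin combinations.
TABLE = {1: (0.6, 0.1), 2: (0.3, 0.3), 3: (0.1, 0.6)}


def doc_points(table):
    """Return {subset Y(a=1): (P(act | h=1), P(act | h=0))} for every subset."""
    bins = sorted(table)
    points = {}
    for size in range(len(bins) + 1):
        for subset in combinations(bins, size):
            p1 = sum(table[b][0] for b in subset)
            p0 = sum(table[b][1] for b in subset)
            # Round away floating-point noise so that 0.6 + 0.3 prints as 0.9.
            points[subset] = (round(p1, 10), round(p0, 10))
    return points


for subset, point in doc_points(TABLE).items():
    print(subset, point)
```

A sensor with B bins yields 2^B subsets, so this brute-force enumeration is practical only for small tables.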

We  give  two  other  examples  to  solidify  the  concept.  Consider  first  the  “broken 
detector.”  A  broken  detector  always  gives  the  same  signal,  which  we  choose  to  be  “3”.  In 
talking  about  the  case  of  only  two  actions,  to  which  we  now  restrict  ourselves,  it  is  convenient 
to  describe  “a=l”  as  “act”  and  “a=2”  as  “do  nothing.” 

The fundamental table of the sensor is:

y ∈ Y:       1     2     3

S =  h=1:   .0    .0    1.                               (4)
     h=0:   .0    .0    1.

The table of the doc becomes:

Y(a=1):              ∅     {1}    {2}    {3}    {1,2}   {1,3}   {2,3}   {1,2,3}

$\mathcal{D}(S)$ =  h=1:   .0     .0     .0     1.     .0      1.      1.      1.
          h=0:   .0     .0     .0     1.     .0      1.      1.      1.          (5)


There  are  only  two  distinct  points  in  the  doc.  One  point  corresponds  to  all  tunings  in  which  the 
set  Y(a=l)  contains  the  element  “3”  of  the  signal  set.  With  this  tuning  we  will  “always  act.” 
The  other  point  corresponds  to  all  other  subsets  of  Y,  and  with  this  tuning  we  will  “never  act.” 

2.2 Sensor Product $S \otimes T$

Consider a second detector whose bins are not necessarily the same as those of $S_1$, with
fundamental table:

y ∈ Y:         5     6

$S_2$ =  h=1:   .8    .2                                   (6)
       h=0:   .3    .7

We define the full product sensor, $S_1 \otimes S_2$, by the product table. It represents all of the
information that can be given by the two sensors together, and has six bins which may be
labeled as 15, 16, 25, 26, 35 and 36. Quite generally, the rows of the fundamental table will be
the conditional joint probability distributions. If the two sensors are (stochastically)
independent the table for the full product is determined by the two individual tables.
Specifically we have:


y ∈ Y:         15     16     25     26     35     36

$S_3$ =  h=1:   .48    .12    .24    .06    .08    .02       (7)
       h=0:   .03    .07    .09    .21    .18    .42
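
As a check on the product table, the sketch below forms the full product of two independent sensors bin by bin. The function name `full_product` is a hypothetical choice, independence is assumed throughout, and the entries of `S1` are inferred from the single-bin columns of the doc table of the first example rather than quoted from a printed table.

```python
def full_product(s, t):
    """Full product of two sensors assumed to be stochastically independent.

    Each sensor is a dict: bin label -> (P(bin | h=1), P(bin | h=0)).
    Under independence the joint table is the row-by-row product.
    """
    return {
        f"{a}{b}": (pa[0] * pb[0], pa[1] * pb[1])
        for a, pa in s.items()
        for b, pb in t.items()
    }


S1 = {1: (0.6, 0.1), 2: (0.3, 0.3), 3: (0.1, 0.6)}
S2 = {5: (0.8, 0.3), 6: (0.2, 0.7)}
S3 = full_product(S1, S2)

for label, (p1, p0) in S3.items():
    print(label, round(p1, 2), round(p0, 2))
```

The printed rows agree with the six-bin table: for example, bin 15 has probabilities .48 and .03.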

In what follows we will assume [...]. Since there are 2 x 3 = 6
bins in the product detector, there are $2^6 = 64$ points in the table of bin combinations which
defines the doc. The resulting doc $\mathcal{D}(S_3)$ is shown in Figure 2. The fundamental table of the
sensor  may  be  reconstructed  from  the  difference  vectors  formed  along  the  upper  boundary, 
which  are  suggested  by  the  light  line  in  Figure  2.  This  line  is,  in  fact,  the  ROC  for  this  system, 
except  that,  since  the  system  and  its  doc  are  discrete,  only  the  vertices  are  realizable  in  a 
deterministic  system. 
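
The count of 64 points and the reconstruction of the table from the upper boundary can be checked with a short sketch. The names `all_doc_points` and `upper_boundary` are hypothetical; the code takes the six-bin product table as given and assumes that no bin has zero probability under h=0, so that every likelihood ratio is finite.

```python
from itertools import combinations

# Six-bin product table: bin -> (P(bin | h=1), P(bin | h=0)).
S3 = {"15": (0.48, 0.03), "16": (0.12, 0.07), "25": (0.24, 0.09),
      "26": (0.06, 0.21), "35": (0.08, 0.18), "36": (0.02, 0.42)}


def all_doc_points(table):
    """One point (F, D) per subset: F = P(act | h=0), D = P(act | h=1)."""
    labels = list(table)
    points = []
    for size in range(len(labels) + 1):
        for subset in combinations(labels, size):
            d = sum(table[b][0] for b in subset)
            f = sum(table[b][1] for b in subset)
            points.append((f, d))
    return points


def upper_boundary(table):
    """Add bins in decreasing likelihood ratio; return the order and the vertices."""
    order = sorted(table, key=lambda b: table[b][0] / table[b][1], reverse=True)
    f = d = 0.0
    vertices = [(f, d)]
    for b in order:
        d += table[b][0]
        f += table[b][1]
        vertices.append((f, d))
    return order, vertices


order, vertices = upper_boundary(S3)
print(len(all_doc_points(S3)), order)
# Difference vectors along the boundary give back the columns of the table.
for b, (v0, v1) in zip(order, zip(vertices, vertices[1:])):
    print(b, round(v1[1] - v0[1], 2), round(v1[0] - v0[0], 2))
```

Each difference vector between consecutive vertices is one column of the fundamental table, which is why the boundary alone suffices to reconstruct the sensor.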

Note that the doc of a sensor product must always contain the doc of either of the
factors because one sensor may be tuned to the point (1,1), in which case the products are all
possible tunings of the other sensor. The product procedure can be followed in the case of a
continuous signal set, by binning to any desired degree of approximation, in order to produce a
standard discrete representation of the doc of a continuous system.

2.3 Restriction of Sensors $\mathcal{R}(M)S$

When a sensor communicates through a finite network channel it may not be able to
pass on all of the available information. It must code the observed signal $y \in Y$ into some M-fold
message. To do this the signal set Y is decomposed into a union of non-overlapping subsets
Y(m=1), Y(m=2), ..., Y(m=M). Each such partition represents a “tuning” of the sensor. In
general the channel capacity (M) is less than the total number of bins. For example, with
M=2, and the sensor $S_1 \otimes S_2$ there are 64 tunings, corresponding to the points in the doc.
Among them is the totally uninformative choice Y(m=1)={25, 26}, which lies on the principal
diagonal of the unit square containing the doc. There are also 5 maximally informative
possibilities:

Y(m=1)={15}, {15,25}, {15,25,16}, {15,25,16,35}, and {15,25,16,35,26}.
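
A restriction with a given tuning simply merges bins. The sketch below is a minimal illustration under that reading: the function name `restrict` and the message labels 1 and 2 are hypothetical, and the six-bin product table is taken as given.

```python
# Six-bin product table: bin -> (P(bin | h=1), P(bin | h=0)).
S3 = {"15": (0.48, 0.03), "16": (0.12, 0.07), "25": (0.24, 0.09),
      "26": (0.06, 0.21), "35": (0.08, 0.18), "36": (0.02, 0.42)}


def restrict(table, partition):
    """M-fold restriction: merge bins according to a partition of the signal set.

    `partition` maps each message label to the bins that are sent as that message.
    Returns the fundamental table of the restricted sensor.
    """
    used = sorted(b for bins in partition.values() for b in bins)
    if used != sorted(table):
        raise ValueError("partition must cover every bin exactly once")
    return {
        m: (sum(table[b][0] for b in bins), sum(table[b][1] for b in bins))
        for m, bins in partition.items()
    }


# The tuning Y(m=1) = {15, 25}.
tuned = restrict(S3, {1: ["15", "25"], 2: ["16", "26", "35", "36"]})
# The uninformative tuning Y(m=1) = {25, 26}.
flat = restrict(S3, {1: ["25", "26"], 2: ["15", "16", "35", "36"]})
print(tuned[1], flat[1])
```

For the tuning {25, 26} the two conditional probabilities of message 1 coincide (both are .30), which is what places that point on the principal diagonal.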

Further discussion is postponed to Part II. We represent any one of the possible tunings by the
general expression $\mathcal{R}(2)S_3$. For example, one specific tuning is $\mathcal{R}(2:\{15,25\})S_3$, whose doc is
shown in Figure 3.

3. Some continuous examples.


We have already remarked that it is practically necessary to replace a quantity that is
“in principle” continuous by a finite set of discrete bins. In other cases it is convenient to
approximate something that is fundamentally discrete as being essentially continuous. We now
consider the extension of the concepts of the doc $\mathcal{D}(S)$, the full product $\otimes$, and the n-fold
restriction $\mathcal{R}(n)$ to cases in which the signal set Y is continuous.

3.1  The  case  of  exponential  response  functions. 

The case of exponential response functions is particularly tractable, and will be pursued until
its simplicity proves to be an embarrassment. The signal set is $Y = [0, \infty)$. The response functions
are $f_1(y) = e^{-y}$ and $f_0(y) = n e^{-ny}$. The set of points in the doc is precisely the allowed region of
our previous paper [Blankenbecler and Kantor88]. Referring to the previous section, we see that
even in the discrete case the set of points in the doc quickly becomes very dense. We may
readily find the boundary (in the case of only two hypotheses) by ordering the points of Y in
decreasing order of the ratio $f_1(y)/f_0(y)$. This provides a parametric representation of the
boundary of the doc, which is sufficient for further numerical calculation.

If we call the parameter involved “z,” a suitable choice is given by
$Y(m=1;z) = [1/z, \infty)$. Using “F,D” to represent points on the boundary we have at once:

$$F_0(Y(m=1;z)) = n \int_{1/z}^{\infty} e^{-ny}\,dy = e^{-n/z} \tag{8}$$

and

$$F_1(Y(m=1;z)) = \int_{1/z}^{\infty} e^{-y}\,dy = e^{-1/z}. \tag{9}$$

In this case a simple analytic relation describes the upper boundary of the doc:

$$F_1(z) = [F_0(z)]^{1/n} \qquad (n \ge 1) \tag{10}$$
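
The closed forms for the exponential case can be checked by direct quadrature. The sketch below is illustrative only: it assumes the response functions $f_1(y) = e^{-y}$ and $f_0(y) = n e^{-ny}$ with trigger region $[1/z, \infty)$, truncates the integral at a large cutoff, and uses a hypothetical helper `tail_integral`; the values n = 3 and z = 2 are arbitrary.

```python
import math


def tail_integral(density, lower, upper=60.0, steps=100_000):
    """Midpoint-rule integral of `density` over [lower, upper].

    The cutoff `upper` stands in for infinity; the neglected tail is of
    order exp(-60) for the densities used here.
    """
    h = (upper - lower) / steps
    return h * sum(density(lower + (i + 0.5) * h) for i in range(steps))


n, z = 3.0, 2.0          # arbitrary illustrative values
threshold = 1.0 / z

F0 = tail_integral(lambda y: n * math.exp(-n * y), threshold)   # false alarm
F1 = tail_integral(lambda y: math.exp(-y), threshold)           # detection

print(F0, math.exp(-n / z))      # numerical vs closed form exp(-n/z)
print(F1, F0 ** (1.0 / n))       # boundary relation D = F^(1/n)
```

Both comparisons agree to well within the quadrature error, consistent with the boundary relation $D = F^{1/n}$.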

We may represent all of the salient features of this problem in a family of three related
graphs, as shown in Figure 4. The first figure shows the response functions on any convenient
scale, as a function of y, or a transformed label. The third figure shows the upper boundary
$\mathcal{B}^+$ of the doc $\mathcal{D}(\cdot)$. The lower boundary is determined by the symmetry of the doc under the
transformation $F \to (1-F)$ and $D \to (1-D)$, the relabeling of actions. We adopt here the
convenient notations:

$$F = F_0(Y(z)) \tag{11}$$

$$D = F_1(Y(z)) \tag{12}$$

corresponding to the notion that $f_0$ represents a “false alarm,” while $f_1$ represents a “true
detection event.” The middle part of the figure shows the translation of any particular
operating point on the boundary of the doc into a corresponding “trigger region” Y(z). The
parameter z itself need never be made explicit.

3.2  Equivalent  Sensors 

Corresponding to the fact that, for a discrete system, permuting the columns of the
fundamental table does not change the doc, there are an infinity of transformations of the
response functions which will leave the doc unchanged in the continuous signal case. A simple
example is given by the Rayleigh distributions:

$$f_0(w) = 2nw\,e^{-nw^2} \tag{13}$$

and

$$f_1(w) = 2w\,e^{-w^2}. \tag{14}$$

The transformation $y = w^2$; $dy = 2w\,dw$ shows that these two sets of response functions, the
Rayleigh and the exponential, have exactly the same integrals over corresponding regions in
their respective signal sets, and hence will have the same doc.
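
The equivalence of the two representations can be illustrated numerically. The sketch below assumes the exponential pair $e^{-y}$, $n e^{-ny}$ and the Rayleigh pair $2w e^{-w^2}$, $2nw e^{-nw^2}$, with the factor n attached to the same hypothesis in both; the helper `integrate`, the value n = 3 and the test region are hypothetical choices.

```python
import math


def integrate(g, a, b, steps=100_000):
    """Midpoint-rule integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


n = 3.0  # arbitrary illustrative value


def compare(y_lo, y_hi):
    """Integrate both representations over corresponding regions y = w**2."""
    w_lo, w_hi = math.sqrt(y_lo), math.sqrt(y_hi)
    exponential = (
        integrate(lambda y: math.exp(-y), y_lo, y_hi),
        integrate(lambda y: n * math.exp(-n * y), y_lo, y_hi),
    )
    rayleigh = (
        integrate(lambda w: 2 * w * math.exp(-w * w), w_lo, w_hi),
        integrate(lambda w: 2 * n * w * math.exp(-n * w * w), w_lo, w_hi),
    )
    return exponential, rayleigh


exponential, rayleigh = compare(0.5, 2.0)
print(exponential)
print(rayleigh)
```

The two pairs of integrals coincide up to quadrature error, so any trigger region in one signal set has a counterpart in the other with the same doc point.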

There is very little difficulty in principle in extending the parametric representation to
any computable forms for $f_0$ and $f_1$. A more complex example, on the signal set $Y = (-\infty, \infty)$, is:

$$f_1(y) = \frac{1}{20\sqrt{2\pi}} \left[ \tfrac{1}{3} e^{-(y-200)^2/2(20)^2} + \tfrac{1}{3} e^{-(y-250)^2/2(20)^2} + \tfrac{1}{3} e^{-(y-300)^2/2(20)^2} \right] \tag{15}$$

$$f_0(y) = \frac{1}{\ldots} \left[ \tfrac{1}{3} e^{\ldots} + \tfrac{1}{3} e^{\ldots} + \tfrac{1}{3} e^{\ldots} \right] \tag{16}$$

The  doc  has  been  calculated  numerically  by  ordering  unit  bins  centered  at  y=  150, ...350,  as 
described  above.  It  is  shown  in  Figure  5.  Note  that  the  trigger  regions  may  be  quite  complex, 
because  the  response  functions  do  not  have  a  monotone  likelihood  property  with  respect  to  the 
label  or  variable  “y”. 

It may be shown that, provided the ratios of the response function do not vary too
rapidly, the doc $\mathcal{D}$ corresponding to any fundamental table whose signal space is continuous
will be a convex set. That is, there will not be any isolated extreme points, or holes within the
boundary $\mathcal{B}$. This property is useful in studying the fundamental operations of the sensor
calculus.

3.3 Standard Forms for Sensor Tables

We have mentioned that different sensors may have the same doc. It is therefore useful
to introduce the notion of a standard doc. This may be done in two ways. One leads to a
continuous signal set, and may be useful for conceptual purposes. The other leads to a discrete
signal set, which is essential for calculation. We suppose that the upper boundary $\mathcal{B}^+$ of the
doc is given in the form $D(F)$. The derivative $D'(F)$ may be shown to exist, from the right, for
all F<1.

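Continuous Standard Form:

$$Y = [-D(0),\, 1] \tag{17}$$

$$f_0(y) = \begin{cases} 0, & y < 0 \\ 1, & 0 < y < 1 \end{cases} \tag{18}$$

and

$$f_1(y) = \begin{cases} 1, & -D(0) < y < 0 \\ D'(y), & 0 < y < 1 \end{cases} \tag{19}$$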

Discrete Standard Form (N points):

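Define the auxiliary function:

$$f_{\mathrm{aux}}(\theta) = \min_{D'(F) < \tan\theta} F. \tag{20}$$

For n=1 to N:

$$F_n = f_{\mathrm{aux}}\!\left(\frac{n\pi}{2N}\right) - f_{\mathrm{aux}}\!\left(\frac{(n-1)\pi}{2N}\right). \tag{21}$$

$$D_n = D\left\{f_{\mathrm{aux}}\!\left(\frac{n\pi}{2N}\right)\right\} - D\left\{f_{\mathrm{aux}}\!\left(\frac{(n-1)\pi}{2N}\right)\right\}. \tag{22}$$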

This construction divides the continuous interval from 0 to 1 into portions over which the slope
of the boundary of the doc changes by a fixed amount, $\pi/2N$. This construction makes use of
the convexity of the doc and, hence, the fact that its boundary has a monotonically changing
slope. [Technically, as we draw the doc, the upper boundary is concave, and the lower
boundary is convex.]

3.4  The  Full  Sensor  Product: 


As  a  simple  example  we  form  the  full  sensor  product  of  two  identical  sensors  with 
exponential  response.  This  is  the  same  as  having  all  the  information  from  two  such  sensors 
before  making  a  decision.  Or,  it  can  be  regarded  as  making  two  successive  (independent) 
observations with the same instrument, before reaching a decision.


We introduce the transparent notation:

$$S = \begin{bmatrix} Y = [0, \infty) \\ d = e^{-y} \\ f = n e^{-ny} \end{bmatrix} \tag{23}$$

to represent the sensor with $Y = [0, \infty)$ and with the response functions indicated. We see that
the full sensor product has the representation:

$$S \otimes S = \begin{bmatrix} [0, \infty) \times [0, \infty) \\ d(x,y) = e^{-(x+y)} \\ f(x,y) = n^2 e^{-n(x+y)} \end{bmatrix} \tag{24}$$


The obvious parametrization for determination of the boundary of the doc is $t = x + y$. The
corresponding trigger regions are of the form $Y(t_0) = [t_0, \infty)$. The element of integration
becomes $t\,dt$, with the results:

$$D(t) = (1 + t)\,e^{-t} \tag{25}$$

$$F(t) = (1 + nt)\,e^{-nt} \tag{26}$$

A general recursive formula is given by:

$$D_k(t) = D_{k-1}(t) + \frac{t^{k-1}}{(k-1)!}\,e^{-t}, \qquad k = 1, 2, \ldots \tag{27}$$

$$F_k(t) = F_{k-1}(t) + \frac{(nt)^{k-1}}{(k-1)!}\,e^{-nt}, \qquad k = 1, 2, \ldots \tag{28}$$

with:

$$D_0(t) = F_0(t) = 0. \tag{29}$$

Figure 6. The full sensor product. The upper boundary of the doc, $\mathcal{B}^+$, is shown for the two-fold
and three-fold product of the exponential sensor with itself. Note that there are diminishing
returns in the continued improvement that repeated measurement represents.

Although we generally cannot express D(F) in closed form, there is no difficulty in preparing
graphs, or performing further calculations on the basis of these formulae and results. Examples
showing the upper boundary $\mathcal{B}^+$ of the doc for the two-fold and three-fold sensor product for
the exponential case are shown in Figure 6.
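
Boundary points for the k-fold product are easy to tabulate. The sketch below assumes that the recursion has the form $G_k(t) = G_{k-1}(t) + (rt)^{k-1} e^{-rt}/(k-1)!$ with $G_0 = 0$, where r = 1 gives D and r = n gives F; this is the tail probability of a sum of k independent exponential signals and reproduces the two-fold results $(1+t)e^{-t}$ and $(1+nt)e^{-nt}$. The function names are hypothetical.

```python
import math


def survival_k(t, k, rate=1.0):
    """Tail probability P(x_1 + ... + x_k > t) for k independent exponential
    signals with the given rate, built up by the recursion
    G_j = G_{j-1} + (rate*t)**(j-1) / (j-1)! * exp(-rate*t), with G_0 = 0.
    """
    total = 0.0
    for j in range(1, k + 1):
        total += (rate * t) ** (j - 1) / math.factorial(j - 1) * math.exp(-rate * t)
    return total


def boundary_point(t, k, n):
    """Point (F, D) on the upper boundary of the k-fold product at threshold t."""
    return survival_k(t, k, rate=n), survival_k(t, k, rate=1.0)


# Tabulate a few boundary points for the two-fold and three-fold products.
for k in (2, 3):
    print(k, [tuple(round(v, 4) for v in boundary_point(t, k, 3.0))
              for t in (0.5, 1.0, 2.0)])
```

Since D(F) has no closed form for k > 1, the boundary is traced parametrically in t, exactly as one would do to prepare the graphs.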

It is reasonable to suppose that if two sensors have the same doc, the doc of their
product with other sensors will not depend on the specific representation $[Y, d(y), f(y)]$ that is
used. The reader may verify this by repeating the preceding calculation, replacing either or
both of the sensor descriptions by the Rayleigh form.

4. Fundamental Network Elements

The fundamental operations that go to build up a network have already been defined:
the communication restriction $\mathcal{R}(M)S$ and the full sensor product $S \otimes S$. There are some basic
topologies which it is instructive to examine in detail, both to illustrate the calculational
techniques, and to sharpen our intuition.

4.1 Specific and General Restrictions $\mathcal{R}(M,t)S$ and $\mathcal{R}(M)S$

We recall that $S \otimes T$ produces a new composite sensor, with its associated doc, which is
always (in a sense to be made clear in Section 6) at least as good as either S or T. On the other
hand, for any particular tuning t, $\mathcal{R}(M,t)S$ is a sensor which is, in general, not as good as S,
because it has only an M-fold output. The tuning t determines the meaning of that output. It
is logically equivalent to a decomposition of the signal set into a set of nonoverlapping subsets
Y(m=1), ..., Y(m=M), but, in practice, t may be represented in a variety of ways. In particular,
when we want to determine the boundary $\mathcal{B}$ of a compound doc $\mathcal{D}(S \otimes T)$, we need only
consider extreme points of the constituent docs. When their signal sets Y(S) and Y(T) are
suitably continuous the boundary is a continuous set, containing all the extreme points. When
the signal sets are discrete, the “boundary” is the set of extreme points.

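Note that the simpler expression $\mathcal{R}(2)S$ represents a more complex object than
$\mathcal{R}(2,t)S$ since it is a set:

$$\mathcal{R}(2)S = \{\, \mathcal{R}(2,t)S : t \text{ a possible tuning of } S \,\} \tag{30}$$

Similarly:

$$\mathcal{R}(k)S \otimes \mathcal{R}(l)T = \{\, s \otimes t : s \in \mathcal{R}(k)S,\ t \in \mathcal{R}(l)T \,\}. \tag{31}$$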

Thus the elements of the set $\mathcal{R}(k)S \otimes \mathcal{R}(l)T$ are labelled by two tunings: one for S and
one for T. In the examples of this paper those tunings are represented by real numbers in the
unit interval, corresponding to the probability of a “false alarm.” This representation of the
tuning is possible when there are only two hypotheses and only two actions or messages.

Any element $s \otimes t \in \mathcal{R}(k)S \otimes \mathcal{R}(l)T$ is itself a sensor, and it has a signal set with $k \times l$
points, which are labelled by the messages coming from S and T, under the tunings selected.
When an action a=1,2,...,A is to be selected from a set A the compound sensor $s \otimes t$ must itself
be tuned. Such a tuning is a decomposition of $Y(s \otimes t)$ into A non-overlapping subsets. In the
examples of this paper A=2 and the decomposition is specified by giving the set Y(a=1)
which, by complementation, specifies the set Y(a=2).

4.2  Binary  Messages 

In the special case of binary messages $m \in \{0,1\}$ it is natural to call the tuning of $s \otimes t$ a
LOGIC. The signal set is $Y(s \otimes t) = \{00, 01, 10, 11\}$. There are $2^4 - 1 = 15$ non-empty subsets in
the doc $\mathcal{D}(s \otimes t)$. Each such subset corresponds to a logical expression. For example {01,10}
corresponds to [m(s)=1 or m(t)=1 but not both], which could be expressed as the exclusive or:
XOR(s,t).
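The count of logics can be checked with a short sketch. The names `SIGNALS`, `nonempty_logics` and `fires` are illustrative assumptions and do not appear in the report; a logic is represented simply as a set of two-character message patterns.

```python
from itertools import combinations, product

# Signal set of the compound sensor for binary messages: 00, 01, 10, 11.
SIGNALS = ["".join(bits) for bits in product("01", repeat=2)]

def nonempty_logics(signals):
    """Return every non-empty subset of the signal set as a frozenset."""
    return [frozenset(c)
            for r in range(1, len(signals) + 1)
            for c in combinations(signals, r)]

LOGICS = nonempty_logics(SIGNALS)

# The exclusive-or logic is the trigger set {01, 10}.
XOR = frozenset({"01", "10"})

def fires(logic, m_s, m_t):
    """True when the message pair (m_s, m_t) lies in the trigger set."""
    return f"{m_s}{m_t}" in logic
```

Enumerating subsets directly is only practical for a handful of sensors, since the number of logics grows as 2 to the power of the number of signal points.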

Corresponding to a given logic there is a fundamental polynomial which appears in
each row of the table characterizing the sensor system. It is defined in terms of the binary
patterns $m = (m_1, \ldots, m_n)$ appearing in the logic:

$$Q_{\mathrm{LOGIC}}(x_1,\ldots,x_n) = \sum_{m \in \mathrm{LOGIC}} \ \prod_{i=1}^{n} x_i^{m_i} (1-x_i)^{1-m_i} \tag{32}$$


Here  n  is  the  number  of  independent  sensors  for  which  a  fusion  center  has  been  used. 
This  particular  form  depends  both  on  the  fact  that  only  two  complementary  trigger  sets  arise 
at  any  sensor  (because  of  the  two-fold  messages),  and  the  fact  that  1  can  be  written  as  the 
power  x°. 
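As a hedged illustration of the fundamental polynomial, the sketch below evaluates it for a logic given as a set of bit patterns. The function name `q_logic` and the string encoding of patterns are assumptions made for this example only.

```python
from math import prod

def q_logic(logic, x):
    """Evaluate Q_LOGIC(x_1..x_n): the sum, over patterns m in the logic,
    of prod_i x_i**m_i * (1 - x_i)**(1 - m_i)."""
    total = 0.0
    for pattern in logic:
        bits = [int(ch) for ch in pattern]
        # A bit of 1 contributes x_i, a bit of 0 contributes (1 - x_i).
        total += prod(xi if mi else 1.0 - xi for xi, mi in zip(x, bits))
    return total

AND = {"11"}
OR = {"01", "10", "11"}
```

Evaluating the same polynomial at the false-alarm probabilities and at the detection probabilities of the component sensors gives the two coordinates of the corresponding point in the doc.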

There is an important set of inequalities restricting the LOGICS that can be extreme
points of $\mathcal{D}(s \otimes t)$. We recall that the elements of the table defining $s \otimes t$ are products for
stochastically independent sensors. In the special case of two-fold messages we may simplify the
notation, using:

$$d_s = d(Y(m(s)=1)) = \mathrm{Prob}\{m(s)=1 \text{ given } h=1\} \tag{33}$$

$$f_s = f(Y(m(s)=1)) = \mathrm{Prob}\{m(s)=1 \text{ given } h=0\}, \tag{34}$$

with similar expressions for the sensor t.

We also set:

$$\bar{d}_s = 1 - d_s, \qquad \bar{f}_s = 1 - f_s. \tag{35}$$

Then the fundamental table describing $s \otimes t$ has as its columns all triples of the form:

$$\begin{array}{c} m(s)\,m(t) \\ d_s^{m_s}\,\bar{d}_s^{\,1-m_s}\; d_t^{m_t}\,\bar{d}_t^{\,1-m_t} \\ f_s^{m_s}\,\bar{f}_s^{\,1-m_s}\; f_t^{m_t}\,\bar{f}_t^{\,1-m_t} \end{array} \tag{36}$$

The doc $\mathcal{D}$ has elements corresponding to all possible sums of these expressions.

Without loss of generality we may suppose that Y(m=1) is chosen so that

$$\frac{d_s}{f_s} \ \ge\ \frac{1-d_s}{1-f_s} = \frac{\bar{d}_s}{\bar{f}_s} \tag{37}$$

with a similar relation for Y(m(t)=1).

It is easy to see that if a “trigger region” contains the point m(s)m(t)...m(n) and does
not contain all the points m'(s)m'(t)...m'(n) for which every $m'_i \ge$ the corresponding $m_i$, then it
is not an extreme point of the upper boundary of the doc of the composite sensor.

We sketch the proof for the case of two-fold signals. A corresponding result may be
proven for k-fold signals in the same way. The statement is empty if the region is {11}.
Otherwise, suppose that for sensor t, the trigger region contains some point with m(t)=0. We
need only show that the corresponding point in the doc is inside the convex hull of the doc.
Without loss of generality we need only consider cases lying in the triangle d>f. We decompose
the given point into:

$$(f,d) = (f_A + f_0,\ d_A + d_0) \tag{38}$$

where $(f_0, d_0)$ is the vector in the doc space corresponding to the point with m(t)=0 in the
signal set. Let $(f_1, d_1)$ represent the vector in doc space corresponding to the same point in the
signal set, but with m(t)=1. We show that (f,d) is not an extreme point of the doc $\mathcal{D}$ by
showing that it lies below the line joining the points $A = (f_A, d_A)$ and
$B = (f_A + f_0 + f_1,\ d_A + d_0 + d_1)$. To prove this we note that the f-coordinate of
$(f_1/(f_0+f_1))A + (f_0/(f_0+f_1))B$ is f, while the d-coordinate is:

$$d_A + \frac{f_0}{f_0+f_1}(d_0 + d_1) \tag{39}$$

$$= d_A + \frac{f_0 d_0 + f_0 d_1}{f_0+f_1} \tag{40}$$

$$\ge d_A + \frac{f_0 d_0 + f_1 d_0}{f_0+f_1} \tag{41}$$

$$= d_A + d_0 = d. \tag{42}$$

The inequality in (41) uses $f_0 d_1 \ge f_1 d_0$, which follows from (37) applied to sensor t.
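The chord argument can be checked numerically. The sketch below is an illustration under stated assumptions: each sensor is described by the (f, d) pair of its message m=1, the pairs satisfy (37), and the helper names `cell` and `chord_minus_point` are hypothetical.

```python
def cell(m_s, m_t, s, t):
    """(f, d) vector of the signal point m_s m_t for independent sensors.
    s and t are the (f, d) pairs attached to the message m = 1."""
    (fs, ds), (ft, dt) = s, t
    f = (fs if m_s else 1 - fs) * (ft if m_t else 1 - ft)
    d = (ds if m_s else 1 - ds) * (dt if m_t else 1 - dt)
    return f, d

def chord_minus_point(rest, m_s, s, t):
    """rest: patterns (m_s, m_t) already in the trigger region (the part A).
    The region also contains (m_s, 0) but not (m_s, 1).
    Returns the height of the chord AB at f minus the height d of the point."""
    f_a = sum(cell(a, b, s, t)[0] for a, b in rest)
    d_a = sum(cell(a, b, s, t)[1] for a, b in rest)
    f0, d0 = cell(m_s, 0, s, t)
    f1, d1 = cell(m_s, 1, s, t)
    d = d_a + d0
    d_chord = d_a + f0 / (f0 + f1) * (d0 + d1)
    return d_chord - d

# Region {01, 10}: A = {01}, the point with m(t)=0 is 10, its partner is 11.
gap = chord_minus_point([(0, 1)], 1, (0.2, 0.6), (0.3, 0.7))
```

A positive value means the point lies strictly below the chord and so cannot be an extreme point of the upper boundary.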

The set of LOGICS is further reduced by the observation that every sensor must play a
role. For example, the doc of any logic that is independent of the message from sensor S is
contained within the doc formed by multiplying the corresponding polynomial $Q_{\mathrm{LOGIC}}$ by
$x_s$. For, we could freeze sensor S to the “ON” position and recover $Q_{\mathrm{LOGIC}}$. When a logic
is independent of the message from sensor S it involves $x_s$ only in the form $x_s + \bar{x}_s = 1$. For
example, with two sensors the logic {10,11} corresponds to $Q(x) = x_s \bar{x}_t + x_s x_t = x_s$. Its doc is
contained within the doc of either {11}=Y(AND), or {01,10,11}=Y(OR).

Of course the degenerate cases Y = ∅ and Y = {00,01,10,11} are equivalent to having
no detector at all, and need not be considered. They correspond to the BROKEN doc:
$\mathcal{D} = \{(0,0), (1,1)\}$.

These three principles are the only ones that we know for reducing the set of possible
logics. Detailed examples are given in [Cherikh88; Thesis CWRU] where the relation between
these rules and a Lagrange multiplier formalism for definition of the boundary of the doc is
developed.

4.3 Specific Descriptions of 2-fold and 3-fold fusion.


Using these rules we find that the cases to be considered are:

(i) Product combination of two sensors: $\mathcal{R}(2)S_1 \otimes \mathcal{R}(2)S_2$:

LOGIC = {11}   AND
      = {01,10,11}   OR   (43)

(ii) Product combination of three sensors: $\mathcal{R}(2)S_1 \otimes \mathcal{R}(2)S_2 \otimes \mathcal{R}(2)S_3$:

LOGIC = {111}   AND
      = {011,101,110,111}   MAJORITY RULE or 2 OUT OF 3
      = {011,100,101,110,111}   1 OR (2 AND 3), plus two cyclic permutations
      = {101,110,111}   1 AND (2 OR 3), plus two cyclic permutations
      = {001,010,011,100,101,110,111}   OR.   (44)
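The lists for two and three sensors can be reproduced by brute force. The sketch below applies the three reduction rules as they are stated in the text (monotone trigger regions, every sensor plays a role, no degenerate cases); the function names are illustrative assumptions.

```python
from itertools import product

def flip(p, i):
    """Return pattern p with bit i complemented."""
    return p[:i] + (1 - p[i],) + p[i + 1:]

def admissible_logics(n):
    """Trigger sets on n binary messages that are non-empty, not the full
    signal set, monotone, and dependent on every sensor."""
    points = list(product((0, 1), repeat=n))
    found = []
    for mask in range(1, 2 ** len(points) - 1):   # skip empty and full sets
        logic = {p for i, p in enumerate(points) if (mask >> i) & 1}
        # Monotone: if p triggers, so does every q with q_i >= p_i for all i.
        monotone = all(q in logic
                       for p in logic for q in points
                       if all(qi >= pi for qi, pi in zip(q, p)))
        # Every sensor matters: flipping its bit changes the outcome somewhere.
        uses_all = all(any((p in logic) != (flip(p, i) in logic) for p in points)
                       for i in range(n))
        if monotone and uses_all:
            found.append(sorted("".join(map(str, p)) for p in logic))
    return found
```

For three sensors this search returns nine logics: AND, OR, the majority rule, and the two families of three related by permutation of the sensors.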

The specific computations needed to trace the extreme points of the doc can always be
written as:

$$\max_{Q_{\mathrm{LOGIC}}(f_1,\ldots,f_n)\,\le\,F}\ Q_{\mathrm{LOGIC}}\bigl(D_1(f_1),\ldots,D_n(f_n)\bigr). \tag{45}$$

Since the function $D_r(F_r)$ is monotonically increasing for r=s or t, the weak
inequality constraint may always be replaced by equality if the doc set is continuous. Thus, for
two detectors in this “fusion” situation the optimization involves a search over one variable.
For three detectors it involves a search over two variables.
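A minimal sketch of the one-variable search for two detectors follows. It assumes that each sensor boundary is supplied as a function from false-alarm probability to detection probability; the function name `fuse_two`, the grid size, and the boundary shapes used in the usage note are hypothetical choices, not taken from the report.

```python
def fuse_two(D1, D2, F, logic="AND", steps=2001):
    """Best (D, f1, f2) on the constraint Q_LOGIC(f1, f2) = F, found by a
    grid search over the single free variable f1."""
    best = (-1.0, None, None)
    for k in range(steps):
        u = k / (steps - 1)
        if logic == "AND":            # f1 * f2 = F, with F <= f1 <= 1
            f1 = F + (1.0 - F) * u
            f2 = F / f1 if f1 > 0 else 0.0
            d = D1(f1) * D2(f2)
        else:                         # OR: (1 - f1)(1 - f2) = 1 - F, 0 <= f1 <= F
            f1 = F * u
            f2 = max(0.0, 1.0 - (1.0 - F) / (1.0 - f1)) if f1 < 1 else 0.0
            d = 1.0 - (1.0 - D1(f1)) * (1.0 - D2(f2))
        if d > best[0]:
            best = (d, f1, f2)
    return best
```

For example, with the assumed boundaries `D1 = lambda f: 1 - (1 - f) ** 3` and `D2 = lambda f: f ** 0.5`, calling `fuse_two(D1, D2, 0.1, "AND")` returns the best detection probability together with the pair of component tunings that achieves it. The endpoints of the search correspond to ignoring one of the two sensors, so the result is never worse than either sensor alone.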

4.4 Series Structures:

When one of two sensors is directly accessible, and the other is only accessible over a k-fold
channel we call the structure “series.” The corresponding doc is represented by
$\mathcal{R}(k)S_1 \otimes S_2$. We say that sensor $S_2$ is “downstream” from sensor $S_1$. Each point in the doc of
the combined system is achievable as the product of one or more pairs of points in the doc of
$S_2$ and some member of $\mathcal{R}(2)S_1$. Let us examine the structure of such pairs.

For any particular choice of the “tuning” of $S_1$, the doc of $\mathcal{R}(2)S_1$ is a set of four
points: (0,0), $(f_1, d_1(f_1))$ and their reflections under $(f,d) \to (1-f, 1-d)$. The only non-trivial choice
of a trigger set is $Y(m=1) \to (f_1, d_1)$ and $Y(m=0) \to (1-f_1, 1-d_1)$. The values of $(f_1, d_1)$ can
range over the entire $\mathcal{D}(S_1)$. Similarly, the points in the doc of $S_2$ can range over $\mathcal{D}(S_2)$.

On the one hand, the doc $\mathcal{D}$ of the series case is a restriction of the doc $\mathcal{D}$ for the full
sensor product. We consider first the discrete case, with both signal sets having 3 elements:
$Y_1 = \{1,2,3\}$ and $Y_2 = \{4,5,6\}$. The signal set of $S_1 \otimes S_2$ is {14,15,16,24,25,26,34,35,36}, which
has 9 points. The doc will consist of $2^9 = 512$ points. The signal set of $\mathcal{R}(2)S_1 \otimes S_2$ is somewhat
more complicated. There are several possibilities for the signal from $\mathcal{R}(2)S_1$, depending upon
the particular tuning, which we denote as $\mathcal{R}(2;t)S$. They are (∅,123), (1,23), (2,13), (12,3) (4
possibilities in all, as their complements provide the remainder of the $2^3 = 8$ total range of
possibilities). However, these possibilities are not simultaneously available! When a specific
tuning is made for the first sensor, one of these possibilities is available and the others are not.
Thus there are only 2×3=6 elements in the Y of the series system, and not 9. Finally, if we are
interested in finding the extreme points of the doc, one of the possible combinations, (2,13),
will not be of interest because it is not an extreme point in the doc of $S_1$.

One way to visualize the relation between $\mathcal{R}(2)S_1 \otimes S_2$ and $S_1 \otimes S_2$ is to form a table
of the possible subsets of the product signal set. For the full sensor product, every element of
the product set may be independently included in the trigger set. For the restricted product
$\mathcal{R}(2)S_1 \otimes S_2$, for every element in $\mathcal{R}(2)S_1$ the elements of $S_2$ may be chosen independently,
and vice versa. But this means that the elements of $Y_1$ must be assigned to two subsets once
and for all, prior to the formation of the trigger set. Thus, the candidates to be on the
boundary $\mathcal{B}$ of the doc $\mathcal{D}(\mathcal{R}(2)S_1 \otimes S_2)$ correspond to the following points in the signal set Y:
(14,15,16,234,235,236) or (124,125,126,34,35,36).
Since only one of these possibilities may be realized at a time, one could not choose the tuning
{14,1245}. The upstream sensor cannot distinguish “2” from “1” and also from “3”, because it
communicates via a 2-fold channel.

4.5 Comparison of the series structure to fusion.

Although the series structure has a more limited signal set (and, hence, a restricted doc
$\mathcal{D}$) than the full sensor product, it is expected to be more general than the fusion structure
$\mathcal{R}(2)S \otimes \mathcal{R}(2)T$. This may be verified by writing out explicitly the elements of the signal set for
the 4 non-trivial possibilities of $(s \otimes t) \in \mathcal{R}(2)S \otimes \mathcal{R}(2)T$. They are:

{14,156,234,2356} 

{145,16,2345,236} 

{124,1256,34,356} 

{1245,126,345,36}.  (46) 

The  first  two  of  these  are  contained  within  the  first  of  the  series  possibilities.  The  remainder 
are  contained  within  the  second.  Note  that  what  appears  as  an  elementary  possibility  in  the 
fusion  structure,  such  as  “1245”  (that  is:  S  says  1  or  2  and  T  says  4  or  5)  is  a  composite  in  the 
series  structure,  being  the  union  of  the  elements  124  and  125. 
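The containment can be verified mechanically for the discrete example. In the sketch below a tuning of a three-signal sensor is represented by a cut position in its ordered signal set, and each element of a signal set is a frozenset of two-character product signals; these representations and all function names are assumptions made for illustration.

```python
from itertools import product

Y1, Y2 = "123", "456"

def blocks(cut, labels):
    """Split an ordered signal set into two contiguous blocks at `cut`."""
    return [labels[:cut], labels[cut:]]

def series_elements(cut1):
    """Signal set of the series structure: an upstream block combined with
    a single downstream signal."""
    return {frozenset(a + b for a in blk) for blk in blocks(cut1, Y1) for b in Y2}

def fusion_elements(cut1, cut2):
    """Signal set of the fusion structure: an upstream block combined with
    a downstream block."""
    return {frozenset(a + b for a, b in product(b1, b2))
            for b1 in blocks(cut1, Y1) for b2 in blocks(cut2, Y2)}

def is_union_of(element, finer):
    """True when `element` is exactly a union of members of `finer`."""
    parts = [p for p in finer if p <= element]
    return set().union(*parts) == set(element) if parts else False
```

With `cut1 = 1` the series signal set is (14,15,16,234,235,236), and the fusion element written “2356” in the text appears as the union of the series elements 235 and 236.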

4.6 Computation of a Series doc $\mathcal{D}(\mathcal{R}(2)S_1 \otimes S_2)$ in the continuous case.

The series configuration can be thought of as a union of docs, corresponding to a set
of sensor products:

$$\mathcal{R}(2)S_1 \otimes S_2 = \{\, s \otimes S_2 : s \in \mathcal{R}(2,t)S_1 \text{ for some tuning } t \,\}. \tag{47}$$

The structure of the specific elements of the doc is a little tricky. In general, the
elements of $s \in \mathcal{R}(2,t_1)S_1$ are of the form shown in Eq. 48:

$$\begin{array}{cc} m=1 & m=0 \\ D_1(t_1) & \bar{D}_1(t_1) \\ F_1(t_1) & \bar{F}_1(t_1) \end{array} \tag{48}$$

That is, there are only two points, and each row has, as its coordinates, some point on the
boundary $\mathcal{B}(S_1)$ of the doc $\mathcal{D}(S_1)$. The elements of $s \otimes S_2$ are sums of products with one factor
drawn from this table and the other drawn from the table of $S_2$. We may write this sensor in
general as a table (we suppress the row containing the labels):

$$\begin{array}{cc} D_1(t_1) \times d_2(y) & \bar{D}_1(t_1) \times d_2(y) \\ F_1(t_1) \times f_2(y) & \bar{F}_1(t_1) \times f_2(y) \end{array} \tag{49}$$

The elements of the boundary $\mathcal{B}$ of the doc $\mathcal{D}$ are included among all possible sums
over subsets of these products. Any such sum may be written as the sum of two terms:

$$d(Y) = \sum_{y \in Y_a} D_1(t_1)\, d_2(y) + \sum_{y \in Y_b} \bar{D}_1(t_1)\, d_2(y) \tag{50}$$

and

$$f(Y) = \sum_{y \in Y_a} F_1(t_1)\, f_2(y) + \sum_{y \in Y_b} \bar{F}_1(t_1)\, f_2(y). \tag{51}$$

To find the extreme points we need only consider extreme choices for the sums; that is,
we need only consider subsets $Y_a$, $Y_b$ which are the trigger sets for points on the upper boundary
(F(t),D(t)) for some value of t, the tuning of detector $S_2$. Since there are two sums involved,
there are two tunings, which we may denote as $t_{2a}$ and $t_{2b}$. Hence the extreme points of the
doc for the series case will have the form:

$$D(Y) = D_1(t_1)\,D_2(t_{2a}) + \bar{D}_1(t_1)\,D_2(t_{2b}) \tag{52}$$

$$F(Y) = F_1(t_1)\,F_2(t_{2a}) + \bar{F}_1(t_1)\,F_2(t_{2b}). \tag{53}$$


We may see explicitly that the doc of the series case contains that of the two-fold
fusion system. In the latter, one of the sets $Y_a$ and $Y_b$ will be either the empty set or the full
signal set. For example, when $t_{2b} = 0$ the expression reduces to a point in the boundary of
$\mathrm{AND}(\mathcal{R}(2)S_1, \mathcal{R}(2)S_2)$,
while when $t_{2a} = 1$ it reduces to a point in the boundary of $\mathrm{OR}(\mathcal{R}(2)S_1, \mathcal{R}(2)S_2)$.

All  the  other  special  cases  can  be  shown  to  lie  within  the  docs  corresponding  to  either 
the  AND  or  the  OR  logic  for  the  fusion  case.  This  confirms  that  when  the  second  sensor  gives 
up  the  freedom  to  let  its  tuning  depend  on  the  signal  received  from  the  first  sensor,  its  power  is 
reduced  to  that  of  the  situation  in  which  each  sensor  must  set  its  tunings  without  knowledge  of 
the  signal  from  the  other. 
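The two reductions can be checked directly from the series expressions (52) and (53). This sketch assumes that each tuning is represented by its false-alarm probability and that each sensor boundary is a function B with B(0)=0 and B(1)=1; the function names are hypothetical.

```python
def series_point(t1, t2a, t2b, B1, B2):
    """(F, D) of the series structure. The downstream sensor uses tuning
    t2a when the upstream message is 1 and t2b when it is 0."""
    F1, D1 = t1, B1(t1)
    F = F1 * t2a + (1 - F1) * t2b
    D = D1 * B2(t2a) + (1 - D1) * B2(t2b)
    return F, D

def and_point(t1, t2, B1, B2):
    """(F, D) of the two-sensor fusion system under the AND logic."""
    return t1 * t2, B1(t1) * B2(t2)

def or_point(t1, t2, B1, B2):
    """(F, D) of the two-sensor fusion system under the OR logic."""
    return 1 - (1 - t1) * (1 - t2), 1 - (1 - B1(t1)) * (1 - B2(t2))
```

Setting `t2b = 0` in `series_point` reproduces `and_point`, and setting `t2a = 1` reproduces `or_point`, so every AND or OR fusion point is also available to the series structure.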

4.7  The  puzzle  of  three-fold  fusion. 

In the case of three detectors we have found that there are 9 non-dominated LOGICS.
Three of these are symmetric. The remaining ones form two families, each of which is closed
and transitive under permutations of the three sensors. We can demonstrate, by explicit
example, that any of the 9 rules may be needed in the general case.

When  the  three  sensors  are  identical  we  have  not  found  any  cases  in  which  the  non- 
symmetric  rules  are  needed  to  determine  the  boundary  of  the  doc.  Thus  we  are  led  to  speculate 
that  perhaps,  when  all  three  sensors  are  the  same,  only  the  three  symmetric  rules  are  needed  to 
find  the  extreme  points  of  the  doc.  However,  the  argument  “symmetric  sensors,  therefore 
symmetric  solutions”  is  a  dangerous  one,  as  shown  in  Section  5.1.  Thus  this  question  remains 
open. 
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5.  Specific  network  results. 

5.1  Spontaneous  symmetry  breaking. 

In  many  cases  where  the  two  sensors  whose  signals  are  combined  at  a  fusion  center  are 
themselves  identical,  we  have  found  that  the  optimal  tunings  of  the  individual  sensors  are 
themselves  the  same  for  every  tuning  of  the  overall  system.  This  kind  of  symmetry  makes 
calculations  much  faster,  particularly  as  one  advances  to  systems  with  more  than  two  sensors 
or  more  than  two  messages  per  channel.  One  may  invest  considerable  energy  in  the  effort  to 
prove  that  this  symmetry  holds  quite  generally,  but  the  efforts  are  doomed  to  failure  because 
counter-examples  exist. 

We exhibit such a counter-example here. The reader will note that the degree of
difference between the performance of the system with optimal tunings of the individual
sensors, and the performance with suboptimal, symmetric tunings, is not large. Thus it may be
possible to prove that symmetric tunings represent a heuristic for tuning which comes within
some provable discrepancy of the best possible tuning.

The  Counter-example. 

Consider the sensor defined by the following fundamental table:

$$S = \begin{array}{c|ccc} d & .375 & .537 & .088 \\ f & .250 & .390 & .360 \end{array} \tag{54}$$

The upper boundary of the doc corresponding to the continuous version of this sensor
is given by linear interpolation between the points:

$$(F,D) = (0,0),\ (.250,\,.375),\ (.640,\,.912),\ (1,1). \tag{55}$$

By direct calculation one finds that the symmetric tunings for the AND and OR logic
at F = .64 are in fact both lower than .912, and so surely lower than the best that can be
achieved with a fusion system. That difference is shown in Figures 7.1 and 7.2.

Figure 7.1 Spontaneous symmetry breaking. The lower curve is the boundary of the doc when
both sensors are tuned to the same value. The upper curve is achieved by relaxing that
restriction, giving full fusion of the sensors.
[Plot title: Optimal overall DOC and symmetric DOC; continuous version of case 3.]

Figure 7.2 Detail of spontaneous symmetry breaking. The largest difference is of the order of
2% in the conditional probability of detection.
[Plot title: Optimal overall DOC and symmetric DOC; continuous version of case 3. Horizontal axis: conditional probability of false alarm.]

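The claim about the symmetric tunings at F = .64 can be reproduced with a few lines. The sketch assumes that the breakpoints of the upper boundary are the cumulative sums of the columns of table (54), taken in order of decreasing ratio d/f; the names `POINTS`, `boundary`, `symmetric_and` and `symmetric_or` are illustrative only.

```python
import bisect

# Assumed breakpoints: cumulative sums of the columns of table (54).
POINTS = [(0.0, 0.0), (0.25, 0.375), (0.64, 0.912), (1.0, 1.0)]

def boundary(f, points=POINTS):
    """Upper doc boundary D(F) by linear interpolation between breakpoints."""
    fs = [p[0] for p in points]
    i = min(max(bisect.bisect_right(fs, f), 1), len(points) - 1)
    (f0, d0), (f1, d1) = points[i - 1], points[i]
    return d0 + (d1 - d0) * (f - f0) / (f1 - f0)

def symmetric_and(F):
    """AND logic with both sensors at the same false-alarm rate f, f * f = F."""
    return boundary(F ** 0.5) ** 2

def symmetric_or(F):
    """OR logic with both sensors at the same f, 1 - (1 - f)**2 = F."""
    f = 1 - (1 - F) ** 0.5
    return 1 - (1 - boundary(f)) ** 2
```

Under these assumptions both `symmetric_and(0.64)` and `symmetric_or(0.64)` fall below .912, the value that a single sensor already reaches at F = .64 and that a fusion system can therefore always match by ignoring one sensor.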

5.2 Non-convexity of the doc for fusion systems.


As we shall discuss in detail in Sections 6-8, the convexity of the doc $\mathcal{D}$ for a detector
system plays an important role in the solution of decision problems, whether they are
characterized in terms of acceptable error rates or by a cost matrix relating actions and
hypotheses. Since the solution of any decision problem involves optimization over the doc $\mathcal{D}$, it
is made easier if that region is convex. However, the doc defined by a fusion system is shaped
by the fact that the action of fusion corresponds to a discrete sensor, with only 4 points in its
doc. For a fusion system, even though the distributed sensors have continuous signal sets, and
continuous docs, the fusion center itself has a discrete signal set, corresponding to finitely many
logics. Examination of the boundary of the doc in example cases shows that the doc is actually
the union of several convex sets. Although the intersection of convex sets will also be convex,
the union, in general, will not. So it is not surprising to see that there are small regions of
non-convexity in the doc of the fusion system. An example is shown in Figure 8.1.


This corresponds to the fusion of two sensors each having the table:

$$S = \begin{array}{c|ccc} d & .4 & .6 & 0 \\ f & 0 & .6 & .4 \end{array} \tag{56}$$


This structure means that each of the sensors can send any of three signals. The first is
an unambiguous identification of the desired event; the third is an unambiguous rejection of it,
and the middle signal is perfectly ambiguous. As shown in Figure 8.2, there is a substantial
dimple in the area where the boundaries of the docs for the two LOGICs cross. The depth of
this dimple can be measured in “natural units” corresponding to distance in the Euclidean
metric on the doc set. It is approximately 6%. We have calculated the depth of the dimple for
all choices of sensor having the general form used here, expressed as a function of the degree of
overlap of the response functions. In this case the overlap is 60%, which is close to the location
of the maximum. If e denotes the area of overlap, the depth of the dimple may be shown, by
elementary geometry, to be:

GAP =    (57)

which achieves its maximum at:

$$e = (\sqrt{5} - 1)/2 = 0.618\ldots \tag{58}$$

One expects that non-convexity will be the rule for fusion systems, unless a single logic
dominates all of the others, and this example of a typical model of sensor imperfection suggests
that the effects may be substantial. (See also Figures 8.2 and 8.3)

Figure 8.1 Non-convexity of the fusion doc. The individual sensors each have doc boundaries
corresponding to the curve labeled DOC. The fusion of two sensors with binary messages has
the boundary labelled MAX DOC. There is a clear “dimple” or non-convex region at the point
where the LOGIC changes.
[Plot title: Two identical detectors in fusion; DOC and OVERALL (MAX) DOC.]

Figure 8.2 Nonconvexity of the doc in fusion. A fairly extreme case occurs when there is a 60%
chance that each of the sensors will give a completely ambiguous signal. The dimple is
approximately 6%.

Figure 8.3 Nonconvexity as a function of ambiguity. The depth of the dimple, measured as
Euclidean distance in the doc space, is given as a function of the conditional probability of the
ambiguous signal from the component detectors.
[Plot title: Fusion for two overlapping sensors. Horizontal axis: conditional probability of uncertain message.]
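The quoted depth of about 6% and the location of the maximum can be cross-checked with a small calculation. What follows is a sketch under assumptions that are not spelled out in the report: the sensor has the form d = (1-e, e, 0), f = (0, e, 1-e); the relevant OR branch keeps one sensor at F = 0 and the relevant AND branch keeps one sensor at the corner (e, 1); and the depth is the perpendicular distance from the crossing of the two branches to the chord joining (0, 1-e²) and (e², 1). The function name `dimple_depth` is hypothetical, and the resulting expression is offered only as one form consistent with the figures quoted in the text.

```python
from math import sqrt

def dimple_depth(e):
    """Euclidean depth of the dimple for overlap e, under the assumptions
    stated in the lead-in paragraph."""
    # OR branch with one sensor held at F = 0:    D = 1 - e**2 + e * F
    # AND branch with one sensor held at (e, 1):  D = 1 - e + F / e
    F_cross = e * e / (1 + e)                 # where the two branches meet
    D_cross = 1 - e * e + e * F_cross
    # The chord joining (0, 1 - e**2) and (e**2, 1) has slope 1.
    D_chord = 1 - e * e + F_cross
    return (D_chord - D_cross) / sqrt(2)      # perpendicular distance
```

Under these assumptions the depth works out to e²(1-e)/(√2 (1+e)), which is about 0.064 at e = 0.6 and is largest near e = 0.618, in agreement with the 6% figure and with (58).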

5.3 The series topology.

In general, the simplest series structure is represented by $\mathcal{R}(2)S_1 \otimes S_2$. One sensor
sends a binary signal to another. One natural question is whether, if one sensor is definitely
better than the other, the good sensor should be placed “upstream” or “downstream.” As we
discuss in Section 6, a sensor S is definitely better than another S' if and only if the doc $\mathcal{D}(S)$
includes the doc $\mathcal{D}(S')$. We illustrate some of the complexity of this problem with a simple finite
example. Let two sensors be described by the tables:

$$G = \begin{array}{c|cccc} d & .40 & .35 & .15 & .10 \\ f & .25 & .25 & .25 & .25 \end{array} \tag{59}$$

and:

$$B = \begin{array}{c|cccc} d & .40 & .30 & .20 & .10 \\ f & .25 & .25 & .25 & .25 \end{array} \tag{60}$$


It is readily verified that G is better than B in that sense.

There are three non-trivial reductions which may be applied in this case: combination
of the first two columns, the first three columns, and the last three columns. We may then
form the direct sensor product of the reductions of G with B, and vice versa. This yields a set
of points which are the extreme points of the composite structure $\mathcal{R}(2)U \otimes D$. We use “U” to
represent the “upstream detector” and “D” to represent the “downstream detector.” The
results are most easily seen in the graphs of Figures 9.1, 9.2, 9.3 which show the complete
upper boundaries $\mathcal{B}^+$ of the two composite docs. These have been verified by direct calculation
based on the piecewise linear formulation corresponding to the discrete sensors G and B. The
enlarged views show that in the neighborhood of F = 0.5 it is “better” to have the better
sensor upstream, while in the neighborhood of F = 0.625 it is better to have the poorer sensor
upstream.
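The statement that G is better than B, and the three two-fold reductions, can be checked from the tables. The sketch assumes that the first row of each table is d and the second is f, and that the upper boundary is obtained by adding signals in order of decreasing ratio d/f; the dictionary layout and function names are illustrative.

```python
from itertools import accumulate

G = {"d": [0.40, 0.35, 0.15, 0.10], "f": [0.25, 0.25, 0.25, 0.25]}
B = {"d": [0.40, 0.30, 0.20, 0.10], "f": [0.25, 0.25, 0.25, 0.25]}

def upper_boundary(sensor):
    """Breakpoints (F, D) of the upper doc boundary: signals are added in
    order of decreasing likelihood ratio d/f."""
    cols = sorted(zip(sensor["f"], sensor["d"]),
                  key=lambda c: c[1] / c[0], reverse=True)
    F = [0.0] + list(accumulate(f for f, _ in cols))
    D = [0.0] + list(accumulate(d for _, d in cols))
    return list(zip(F, D))

def reductions(sensor):
    """The non-trivial two-fold reductions, one per interior breakpoint.
    Each entry is the (F, D) pair attached to the message m = 1."""
    pts = upper_boundary(sensor)
    return [pts[k] for k in range(1, len(pts) - 1)]
```

Because both sensors share the same false-alarm breakpoints, comparing the detection values at those breakpoints is enough to compare the two piecewise-linear boundaries.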

As with the case of the spontaneous symmetry breaking, the effects are small, and we
do not know whether they can be shown to be small, in this sense, in every case. Of course it
must be remembered that differences on the doc graph are multiplied by some scale factor
depending on the importance of the problem to which the sensor system is applied.

Figure 9.1 Two sensors in series. The individual sensors have the upper boundaries 1, which is
better, and 2, which is poorer. The two possible choices for which sensor is upstream yield two
composite docs, whose upper boundaries are close to each other.

Figure 9.2 Detail of Figure 9.1. In the vicinity of 50% false alarm rate, the performance of the
system with the better detector upstream is superior.
[Plot title: Counterexample for sequencing in tandem; DOC1, DOC2 and DOCs of each design.]

Figure 9.3 Detail of Figure 9.1. In the vicinity of 62.5% false alarm rate, the performance of
the system with the poorer detector upstream is superior. Possible tunings of the system with
the poorer detector upstream are labeled “)”. There are six variant tunings of the two possible
configurations, but several of them overlap in this small region.
[Plot title: Comparison of R(2)B×G and R(2)G×B. Legend: “(” for R(2)G×B, “)” for R(2)B×G.]
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5.4  General  procedures  for  the  calculation  of  any  network. 


The ideas set forth in Sections 1-3 permit us to provide an algorithm for the discussion
of any network in which signals flow only one way, and there are no closed loops. We have
seen that such a network can be represented by some combination of the basic operations of
the sensor calculus, restriction $\mathcal{R}(k)$ and full product $\otimes$. We confine ourselves to the case of
two hypotheses and two-fold signals.

The doc boundaries $\mathcal{B}$ of the root sensors may be given in either analytical form or
numerical form. Thereafter all calculations will produce numerical form. The result of any such
calculation can be thought of as a table enumerating the points of the boundary $\mathcal{B}$, and the
tunings of the component detectors that give rise to them. For example, when the operation is
a binary fusion $\mathcal{R}(2,t_1)S_1 \otimes \mathcal{R}(2,t_2)S_2$ the table contains entries:

F   D   $t_1$   $t_2$   LOGIC   (61)

where $t_1$, $t_2$ and LOGIC are the coordinates of the optimal solutions to the problem:

$$D = \max_{\mathrm{LOGIC},\ t_1,\ t_2} Q_{\mathrm{LOGIC}}\bigl(D_1(t_1),\ D_2(t_2)\bigr) \tag{62}$$

subject to the conditions:

$$Q_{\mathrm{LOGIC}}\bigl(F_1(t_1),\ F_2(t_2)\bigr) = F \tag{63}$$

and:

$$(F_1(t_1),\ D_1(t_1)) \in \mathcal{D}(S_1) \tag{64}$$

$$(F_2(t_2),\ D_2(t_2)) \in \mathcal{D}(S_2). \tag{65}$$

In practice, for the case of two hypotheses and two actions, the value of $F_1(t_1)$ may be
used to represent the tuning as well. If, for reasons of economy or reliability, the choice of
LOGIC is fixed then the table will only contain the values of $(F, D, t_1, t_2)$.

For  the  case  of  fusion  of  more  than  two  sensors,  the  general  form  will  be  the  same,  but 
the  enumeration  of  LOGICs  will  be  more  complex. 

For the series structure $\mathcal{R}(2,t_u)U \otimes D$ the table is slightly more complicated,
containing:

F   D   $t_u$   $t_{d1}$   $t_{d2}$   (66)

where $t_u$, $t_{d1}$ and $t_{d2}$ solve the problem:

$$D = \max_{t_u,\ t_{d1},\ t_{d2}} D_U(t_u)\,D_D(t_{d1}) + [1 - D_U(t_u)]\,D_D(t_{d2}) \tag{67}$$

subject to the conditions

$$F_U(t_u)\,F_D(t_{d1}) + [1 - F_U(t_u)]\,F_D(t_{d2}) = F \tag{68}$$

and:

$$(F_U(t_u),\ D_U(t_u)) \in \mathcal{D}(U) \tag{69}$$

$$(F_D(t_{d1}),\ D_D(t_{d1})) \in \mathcal{D}(D) \tag{70}$$

$$(F_D(t_{d2}),\ D_D(t_{d2})) \in \mathcal{D}(D). \tag{71}$$

Such tables may be used to determine the doc boundary $\mathcal{B}$ for any composite into
which these composites enter as components. In the practical application of such a system all
the tables must be maintained available for use. The overall detector system boundary $\mathcal{B}$ is
used to determine the overall optimal tuning, as described in Sections 6-8. That tuning is then
looked up in the overall system tuning table to determine the optimal tunings of the major
subcomponents. This lookup process is iterated until the tuning of each fundamental sensor in
the network has been determined, as well as the tunings of the intermediate sensors.
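A tuning table of the kind described for binary fusion, with rows (F, D, t1, t2, LOGIC), might be built and consulted as in the sketch below. It is an illustration only: the grid search, the restriction to the AND and OR logics, the use of the weak constraint Q ≤ F, and the names `fusion_table` and `lookup` are all assumptions of this example.

```python
def fusion_table(B1, B2, grid=41):
    """Rows (F, D, t1, t2, LOGIC). For each F on a grid, keep the best D
    found over the AND and OR logics and over tunings with Q(t1, t2) <= F.
    B1 and B2 map a false-alarm probability to the boundary detection value."""
    ts = [k / (grid - 1) for k in range(grid)]
    candidates = []
    for t1 in ts:
        for t2 in ts:
            candidates.append((t1 * t2, B1(t1) * B2(t2), t1, t2, "AND"))
            candidates.append((1 - (1 - t1) * (1 - t2),
                               1 - (1 - B1(t1)) * (1 - B2(t2)), t1, t2, "OR"))
    table = []
    for F in ts:
        feasible = [c for c in candidates if c[0] <= F + 1e-12]
        best = max(feasible, key=lambda c: c[1])
        table.append((F, best[1], best[2], best[3], best[4]))
    return table

def lookup(table, F):
    """Row with the largest tabulated F not exceeding the requested F."""
    rows = [r for r in table if r[0] <= F + 1e-12]
    return rows[-1]
```

Once the overall operating point F has been chosen, `lookup` returns the component tunings and the logic to use; in a larger network the same lookup would be repeated on the tables of the subcomponents.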
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6.The  link  between  decision  making  and  the  doc. 

In the Bayesian formulation a problem is described in terms of probabilities, actions
and hypotheses. In general there is a cost matrix C(a,h) defined for $a \in A$, the set of actions, and
$h \in H$, the set of hypotheses. In general one might permit the cost function to also depend on the
probability that each of the several hypotheses is true, in which case we might write it as
C(a,p), where $p = (p(h=1), \ldots, p(h=H))$. The probabilities have some specified values $p^0$ prior to
the observation, and are updated to the values p by the observations of the sensor network. In
the general case the cost may be a non-linear function of p. For example, the cost may increase
more rapidly as the leakage through a defensive system increases, and the number of survivors
decreases. Thus improvement in detection yields diminishing returns. In another setting,
improvement in detection will, in general, bring diminishing returns because the precision of a
measurement increases only as the square root of the number of detections.

We restrict ourselves in this paper to the customary engineering cost model, in which
the cost function is linear in the individual probabilities, and may be written as:

$$C(a,p) = \sum_{h \in H} C(a,h)\,p(h). \tag{72}$$

In this case, the cost of choosing a given action may be expressed as:

$$EC(a) = \sum_{h \in H} \mathrm{Prob}\{a \text{ and } h\}\,C(a,h). \tag{73}$$

If the action a is taken whenever the signal y is in the trigger region Y(a), this
becomes:

$$EC(a) = \sum_{h \in H} p^0_h\, f_h(Y(a))\, C(a,h). \tag{74}$$

Note further that, for this formulation, in which the expected cost is to be minimized
without additional constraints, the cost matrix C(a,h) and the prior probability enter only in a
single combination: $W(a,h) = p^0_h C(a,h)$. This can be thought of as a vector, labelled by a, whose
several components are indexed by h. With this perspective, the expected cost is simply the
inner product of the vector $f(Y(a))$, whose components are $f_h(Y(a))$, and the vector $W(a)$.

Thus solution of the Bayesian decision problem amounts to finding, for each a, the
tuning Y(a) which minimizes the overall expected cost. In the general case this will be
accomplished by assigning each element $y \in Y$ to that action a(y) for which
$W(a(y)) \cdot f(y) \le W(a') \cdot f(y)$ for
all other actions a'. We concentrate on the case of exactly two hypotheses and two actions. In
this case we need only specify one trigger set, Y(a=1), with the other, Y(a=0), defined by
complementation. Similarly, the space in which the vectors f and W lie has only two
dimensions. As in Section 4, we will refer to its axes as the f and d axes, corresponding to h=0
and h=1 respectively. The components of f are

It can be shown by direct calculation [Blankenbecler,Kantor88] that for this case the
expected cost depends only on certain differences, which may be thought of as the cost of two
kinds of error:

$$C(h=1) = C(a=0,h=1) - C(1,1) \tag{75}$$

and

$$C(h=0) = C(a=1,h=0) - C(0,0). \tag{76}$$

We  assume,  without  loss  of  generality,  that  both  of  these  quantities  are  positive.  (If  they  are 
both  negative  we  should  relabel  the  actions.  If  they  have  opposite  signs  then  one  of  the  actions 
is  to  be  preferred  whatever  the  state  of  nature,  and  no  sensor  system  is  needed.)  This 
particular simplification is unique to the case of two actions, in which the difference vector
R = W(a=1) − W(a=0) can be used to determine whether W(a=1)·f(y) is less than or greater than W(a=0)·f(y).
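The two-action rule just described can be illustrated with a short sketch. It is not taken from the report: the function name `bayes_trigger_set`, the two-bin response table, and the prior and cost values are all hypothetical, chosen only to show how the sign of R·f(y) selects the trigger set.

```python
def bayes_trigger_set(f, p0, C):
    """Return the set of signals y assigned to action a=1.

    f[h][y] : probability of signal y under hypothesis h (h = 0, 1)
    p0[h]   : prior probability of hypothesis h
    C[a][h] : cost of action a when hypothesis h is true
    """
    # Effective cost vectors W(a, h) = p0(h) * C(a, h).
    W = [[p0[h] * C[a][h] for h in range(2)] for a in range(2)]
    # Difference vector R = W(a=1) - W(a=0).
    R = [W[1][h] - W[0][h] for h in range(2)]
    trigger = set()
    for y in range(len(f[0])):
        # Act on y only if acting lowers the expected cost: R . f(y) < 0.
        if sum(R[h] * f[h][y] for h in range(2)) < 0:
            trigger.add(y)
    return trigger


# Hypothetical two-bin sensor with equal priors (illustrative numbers only).
f = [[0.25, 0.75],   # h = 0: bin 0 is rarely hit
     [0.75, 0.25]]   # h = 1: bin 0 is usually hit
p0 = [0.5, 0.5]
C = [[0.0, 2.0],     # a = 0 (do not act): a miss costs 2
     [1.0, 0.0]]     # a = 1 (act): a false alarm costs 1
print(bayes_trigger_set(f, p0, C))
```

With these assumed numbers only bin 0 triggers the action; raising the assumed cost of a miss enlarges the trigger set.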

By direct calculation we find that the expected cost may be rewritten as:

$$EC = \sum_{h=1}^{2} \sum_{a=0}^{1} p^0_h\, f_h(Y(a))\, C(a,h) \tag{77}$$

$$= \sum_{h=1}^{2} \left[ p^0_h\, f_h(Y(1))\, C(1,h) + p^0_h\, f_h(Y(0))\, C(0,h) \right]. \tag{78}$$

We use the fact that there are only two possible actions to write f(Y(0)) = E − f(Y(1)),
leading to

$$EC = W(0) \cdot E + (W(1) - W(0)) \cdot f(Y(1)), \qquad E = (1,1) \ \text{(H terms)}. \tag{80}$$

Thus  the  problem  of  minimizing  the  expected  cost  is  the  same  as  minimizing  the  value 
of  the  second  dot  product.  This  can  be  visualized  as  sweeping  a  hyperplane  perpendicular  to 
R=W(1)— W(0)  across  the  doc  until  it  reaches  an  extreme  point.  When  the  dot  product  is  as 
small as possible the corresponding choice of the tuning Y(a=1) is optimal. We show this
construction  graphically.  Note  that  the  two  terms  of  R  are,  using  for  “0”,  and  recalling 
that  a=0  means  “do  not  act”: 


$$R_d = p^0_1\, C(1,1) - p^0_1\, C(a=0,1), \qquad h=1 \ (\text{“d”}) \tag{81}$$

$$R_f = p^0_2\, C(1,2) - p^0_2\, C(a=0,2), \qquad h=2 \ (\text{“f”}) \tag{82}$$

These represent the a priori risk associated with the two possible states of nature.
Presumably the first is negative and the second is positive. Their ratio determines the slope of
the line that sweeps across the doc. For $R_f \gg |R_d|$ the line is nearly vertical, which favors a
tuning very close to (0,0). This is reasonable since the risk of an incorrect response (false
alarm) is relatively great, inhibiting us from action. Conversely, when $|R_d| \gg R_f$ tunings close
to (1,1) are preferred because a miss would be very costly.

No  matter  what  the  nature  of  the  doc  —  be  it  discrete  or  continuous  —  the  solution 
to  Bayesian  problems  will  be  found  at  the  extreme  points  of  the  doc.  If  it  is  discrete  these  are 
isolated  points,  as  in  Figure  10.1.  If  it  is  continuous  all  the  points  of  the  boundary  are  extreme 
points  (Figure  10.2).  Thus,  in  one  way  or  another,  everything  that  we  need  to  know  about  the 
doc is contained in a “listing” of its extreme points. This may be given in closed analytic form
(rarely), in a tabular form with interpolation rules, or by direct enumeration.

Figure 10.1 Bayesian problems: Discrete case. The cost and prior probabilities determine a
direction, represented here by level lines. The solution to the problem is to move as far in that
direction as possible, without leaving the doc. As the direction rotates, the optimal tuning
remains “stuck” at a vertex of the doc, until it is ready to jump to another one.

Figure 10.2 Bayesian problems: Continuous case. When the boundary of the doc is continuous,
the level line for the optimal tuning rolls around the doc, and the tuning point changes
continuously.

To  sum  up,  the  solution  of  any  Bayesian  problem  is  reduced  to  complete  knowledge  of 
the  boundary  of  the  doc.  Since  the  boundary  of  the  doc  also  provides  all  the  information 
needed  to  carry  out  the  operations  of  the  sensor  calculus,  we  consider  an  alternative  way  to 
characterize  it. 
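The reduction of a Bayesian problem to a search over the points of a discrete doc can be sketched as follows. This is an illustrative sketch only: the helper names `doc_points` and `best_tuning` and the two-bin table are assumptions, and the brute-force enumeration of subsets is practical only for small signal sets.

```python
from itertools import combinations


def doc_points(f):
    """All (F, D) operating points of a discrete sensor.

    f[0][y] and f[1][y] are the signal probabilities under the
    "no threat" and "threat" hypotheses. Each subset of signals is
    one deterministic tuning Y(a=1).
    """
    n = len(f[0])
    points = {}
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            F = sum(f[0][y] for y in subset)  # false alarm probability
            D = sum(f[1][y] for y in subset)  # detection probability
            points[subset] = (F, D)
    return points


def best_tuning(f, R_f, R_d):
    """Tuning minimizing R_f*F + R_d*D (R_f > 0 and R_d < 0 assumed)."""
    pts = doc_points(f)
    return min(pts, key=lambda s: R_f * pts[s][0] + R_d * pts[s][1])


# Hypothetical two-bin sensor; the risk values below are illustrative.
f = [[0.25, 0.75], [0.75, 0.25]]
for R_d in (-0.2, -1.0, -5.0):
    print(R_d, best_tuning(f, 1.0, R_d))
```

As the assumed risk of a miss grows, the selected tuning moves from "never act" to "act on bin 0" to "always act", staying at one vertex over a range of directions before moving to the next.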


6.2  The Neyman-Pearson Formulation.

In  general,  the  extreme  points  of  the  doc  are  all  those  points  through  which  a 
hyperplane  may  be  passed  without  including  any  interior  points  of  the  doc.  In  the  general  case 
the  interior  points  are  all  convex  combinations  of  extreme  points  that  are  not  themselves 
extreme  points.  One  way  to  enumerate  the  extreme  points  is  to  consider  all  possible  Bayesian 
problems,  and  find  the  solutions  for  each.  Because  the  information  about  prior  probabilities 
and  about  costs  enters  only  through  a  single  ratio,  this  would  be  a  highly  redundant 
enumeration. A more efficient approach is to consider all values of the determining ratio
$R_d/R_f$.

Yet  another  method  is  to  trace  out  the  extreme  points  by  gradual  relaxation  of  an 
artificial  constraint.  This  approach  is  familiar  from  statistics,  where  it  defines  the  operating 
characteristic  of  a  test,  and  has  given  rise  to  the  name  “Receiver  Operating  Characteristic.” 
Interestingly enough, this terminology has made its way back into statistics as well
[Kraemer88, Swets72, Swets88]. The general theory of most powerful tests was developed by
Neyman  and  Pearson  [Neyman42]  and  so  we  refer  to  this  approach  as  the  Neyman-Pearson 
formulation  of  the  optimal  discrimination  and  decision  problem. 

Points  on  the  boundary  of  the  doc  can  be  characterized  as  either: 

$$D(F) = \max_{(f,d) \in \text{doc},\ f \le F} d \tag{83}$$

or

$$F(D) = \min_{(f,d) \in \text{doc},\ d \ge D} f. \tag{84}$$

This approach has been used to provide independent proofs of the convexity of the doc,
and of the fact that the boundary is piecewise differentiable [Cherikh88]. When the doc is
continuous, the function D(F) has a simple relation to the effective cost vectors W at the
optimum tuning. Since the tuning point t* is on the boundary we have:


$$(W(a=1) - W(a=0)) \cdot (F(t^*), D(t^*)) \tag{85}$$

is  a  minimum,  or: 

$$\frac{d}{dt} \left[ F(t)\, R_f - D(t)\, |R_d| \right]_{t=t^*} = 0, \tag{86}$$

or:

$$F'(t^*)\, R_f = D'(t^*)\, |R_d|. \tag{87}$$

But

$$D'(t^*)/F'(t^*) = \left. dD/dF \right|_{t^*}. \tag{88}$$


That is, the slope of the boundary D(F) at the tuning point is given by $R_f/|R_d|$. In
practice D(F) is often most easily found in the parametric form. Because the slope is itself
monotonically decreasing, many fast algorithms exist for finding the optimal tuning.
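One such algorithm is plain bisection on the slope. The sketch below is an illustration under stated assumptions rather than the method used in the report: the function name `optimal_false_alarm`, the boundary D(F) = sqrt(F), and the risk values are all hypothetical.

```python
def optimal_false_alarm(dD_dF, target_slope, lo=1e-12, hi=1.0, tol=1e-10):
    """Find F* with dD/dF(F*) = target_slope by bisection.

    Assumes dD_dF is continuous and monotonically decreasing on (0, 1],
    as it is for the upper boundary of a convex doc.
    """
    if dD_dF(hi) >= target_slope:
        return hi            # slope never drops to the target: tune at F = 1
    if dD_dF(lo) <= target_slope:
        return lo            # slope already below the target: tune near F = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dD_dF(mid) > target_slope:
            lo = mid         # boundary still steeper than the cost line
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical boundary D(F) = sqrt(F), so dD/dF = 1 / (2 sqrt(F)).
slope = lambda F: 0.5 / F ** 0.5
R_f, R_d = 1.0, -1.0                      # illustrative risk components
F_star = optimal_false_alarm(slope, R_f / abs(R_d))
print(F_star, F_star ** 0.5)
```

For this assumed boundary the condition 1/(2 sqrt(F)) = 1 gives F* = 0.25 analytically, which the bisection reproduces to within the tolerance.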


The  formulation  just  given,  for  the  Neyman  Pearson  problem,  represents  the  simplest 
kind  of  constrained  optimization.  There  are  realistic  situations  in  which  another  kind  of 
constraint  arises.  Consider  a  situation  in  which  there  is  an  expected  series  of  incidents, 
numbering  I  in  all.  Suppose  that  the  available  budget  of  responses  is  B,  and  that  the  prior 
probability is that a fraction $p^0_1$ will be “true events” corresponding to h=1, while a fraction
$p^0_2$ will be “non events” corresponding to h=2. Then when the system is tuned to the operating
point (F(t*), D(t*)) the expected total number of responses will be:

$$ER = I \left[ p^0_1\, D(t^*) + p^0_2\, F(t^*) \right]. \tag{89}$$

If ER is less than B there is no problem. On the other hand, if the number of responses
exceeds the budget allowed then some fraction of the incidents will not, in fact, be responded
to. The effective performance in this case is reduced to (B/ER)(F(t*),D(t*)). This point is a
convex combination of the points (F(t*),D(t*)) and (0,0) and so is interior to the doc. It is
therefore  not  optimal  for  any  choice  of  the  prior  probabilities  and  cost  information.  Thus  the 
optimal  tuning  will  be  the  tuning  for  which  the  expected  number  of  responses  is  equal  to  the 
budget. 
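The budget argument can be made concrete with a small sketch. Everything named here is an assumption for illustration: the helpers `expected_responses` and `budget_limited_tuning`, the boundary D(F) = sqrt(F), and the numerical values of the prior, the number of incidents, and the budget.

```python
def expected_responses(F, D, p_true, incidents):
    """Expected number of responses: I * (p_true * D(F) + p_false * F)."""
    return incidents * (p_true * D(F) + (1.0 - p_true) * F)


def budget_limited_tuning(F_unconstrained, D, p_true, incidents, budget,
                          tol=1e-10):
    """Largest F <= F_unconstrained whose expected responses fit the budget.

    Assumes D(F) is non-decreasing, so expected responses grow with F.
    """
    if expected_responses(F_unconstrained, D, p_true, incidents) <= budget:
        return F_unconstrained          # the budget does not bind
    lo, hi = 0.0, F_unconstrained
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_responses(mid, D, p_true, incidents) > budget:
            hi = mid
        else:
            lo = mid
    return lo


# Hypothetical numbers: 100 incidents, 10% true events, 20 responses allowed.
D = lambda F: F ** 0.5
F_b = budget_limited_tuning(0.25, D, p_true=0.1, incidents=100, budget=20)
print(F_b, expected_responses(F_b, D, 0.1, 100))
```

In this assumed case the unconstrained tuning F = 0.25 would call for 27.5 responses, so the search backs off along the boundary until the expected number of responses equals the budget.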


So, generally, the solution to the problem:

$$\max\ W \cdot f \tag{90}$$

subject to the constraints:

$$f \in \text{doc} \tag{91}$$

and

$$p^0 \cdot f \le B/I \tag{92}$$

will lead to a point which is on the boundary of the doc and, for some choices of the cost
matrix, also on the line given by $p^0 \cdot f = B/I$.


If  the  doc  is  continuous  this  does  not  pose  any  problems.  Every  point  on  the  boundary 
of  the  doc  is  accessible  by  a  suitable  choice  of  the  tuning  Y(a).  However,  if  the  doc  is  discrete  it 
may  happen  that  the  intersection  of  the  boundary  (strictly  speaking,  of  the  convex  hull  of  the 
extreme  points)  with  the  constraint  given  by  Equation  (92)  will  not  be  a  point  of  the  doc.  In 
the  most  general  analyses  of  optimal  design  of  experiment  [Blackwell54]  this  is  dealt  with  by 
using  a  “mixed  strategy.”  Under  a  mixed  strategy,  points  on  the  line  connecting  two  elements 
of  the  doc  are  achieved  by  using  each  of  the  corresponding  strategies  a  fixed  fraction  of  the 
time,  with  random  selection  of  which  strategy  is  to  be  used  at  any  given  time. 


For  a  single  sensor  one  may  implement  such  a  random  strategy  by  broadening  the  bins 
into  a  continuum,  and  then  choosing  a  tuning  which  effectively  mixes  the  bins  in  fixed 
proportions. For example, the sensor described by the table:

S1 = | .75  .25 |
     | .25  .75 |

may be replaced by a sensor with continuous signal set Y = [0,1] and the response functions:

$$f_0(y) = 1, \qquad 0 < y < 1 \tag{93}$$

$$f_1(y) = \begin{cases} 3, & 0 < y < .25 \\ 1/3, & .25 < y < 1 \end{cases} \tag{94}$$

The graph of the upper boundary of the doc is:

$$D(F) = \begin{cases} 3F, & 0 < F < .25 \\ .75 + (F - .25)/3, & .25 < F < 1. \end{cases} \tag{95}$$

It  is  clear  that  this  doc  has  no  gaps  in  its  boundary.  In  terms  of  the  original  sensor 
table,  a  tuning  such  as  F  =  .5  corresponds  to  a  mixture: 

$$\tfrac{1}{3}\, S_1(\text{tuned to } Y(a) = \{1,2\}) + \tfrac{2}{3}\, S_1(\text{tuned to } Y(a) = \{1\}) \tag{96}$$

with the false alarm rate:

$$F = (1/3)(1) + (2/3)(.25) = 0.5 \tag{97}$$

and the detection rate:

$$D = (1/3)(1) + (2/3)(.75) = 5/6. \tag{98}$$
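The arithmetic of this mixture can be checked with a few lines of code. The sketch below is only a numerical check; the function names `boundary` and `mix` are introduced here for illustration, and the two pure tunings are the operating points (1, 1) and (.25, .75) of the sensor table above.

```python
def boundary(F):
    """Upper boundary D(F) of the broadened (continuous) sensor."""
    return 3.0 * F if F <= 0.25 else 0.75 + (F - 0.25) / 3.0


def mix(points, weights):
    """Operating point (F, D) of a mixed strategy over pure tunings."""
    F = sum(w * p[0] for w, p in zip(weights, points))
    D = sum(w * p[1] for w, p in zip(weights, points))
    return F, D


always_act = (1.0, 1.0)        # S1 tuned to Y(a) = {1, 2}
act_on_bin_1 = (0.25, 0.75)    # S1 tuned to Y(a) = {1}
F, D = mix([always_act, act_on_bin_1], [1 / 3, 2 / 3])
print(F, D, boundary(F))
```

The mixed point lies on the boundary of the broadened sensor, which is the sense in which broadening the bins and mixing the tunings amount to the same thing.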
7.  Continuity  and  discontinuity  in  the  behavior  of  network  detector  systems.

It  has  been  noted  elsewhere  [Blankenbecler,Kantor88]  that  even  for  a  simple  model 
problem,  it  may  happen  that  the  tuning  of  a  fusion  system  jumps  discontinuously  when  the 
LOGIC  changes  from  AND  to  OR.  At  the  same  time,  it  was  observed  that  the  optimal  cost 
corresponding  to  the  best  tuning  does  not  exhibit  a  discontinuity.  We  are  now  in  a  position  to 
explain  both  of  these  phenomena,  and  to  comment  on  their  significance  for  the  optimal  design 
of  distributed  systems. 

First  we  note  that,  for  a  discrete  sensor,  there  will  be  discontinuities  of  tuning,  as  the 
line representing the constant value of the cost “rolls around” the boundary of the doc,
touching at the extreme points. However, the cost associated with the best tuning for a given
value of R is continuous as the direction of R varies. This is because, when R is such that
either of two tunings is optimal (i.e. the line of constant R·f is an extreme edge of the doc)
then the cost associated with each of the two tunings is the same.
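A small numerical sketch shows the tuning jumping while the cost stays continuous. It assumes a discrete sensor whose doc has the three upper extreme points (0,0), (.25,.75) and (1,1), and it measures cost in the reduced form J = F − rD with r = |R_d/R_f|, that is, the cost with the constant term dropped and the scale factor R_f divided out. The names `VERTICES`, `reduced_cost` and `best` are illustrative.

```python
VERTICES = [(0.0, 0.0), (0.25, 0.75), (1.0, 1.0)]  # (F, D) extreme points


def reduced_cost(point, r):
    """Reduced cost J = F - r*D, with r = |R_d / R_f|."""
    F, D = point
    return F - r * D


def best(r):
    """Return (optimal vertex, optimal reduced cost) for the ratio r."""
    v = min(VERTICES, key=lambda p: reduced_cost(p, r))
    return v, reduced_cost(v, r)


# Scan r across the two critical values r = 1/3 and r = 3.
for r in (0.30, 0.33, 0.34, 2.9, 3.1):
    print(r, best(r))
```

Near r = 1/3 the optimal vertex changes from (0,0) to (.25,.75), yet the optimal reduced cost changes only slightly, because the two vertices have equal cost exactly at the critical ratio.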

Exactly the same phenomenon can occur when the doc $\mathcal{D}$ of a fusion system exhibits
non-convexity  which,  as  we  discussed  in  Section  5,  is  a  general  occurrence.  We  see  that  for  a 
certain  critical  value  of  the  vectors  W  the  extreme  value  of  the  cost  will  occur  at  two  distinct 
points  on  the  boundary  of  the  doc,  corresponding  to  two  different  choices  of  the  logic.  With 
simple  fusion,  as  the  difference  vector  R  moves  the  optimal  tuning  will  jump  suddenly  from  the 
value  appropriate  for  the  AND  logic  to  that  appropriate  for  the  OR  logic.  The  “cost”  at  the 
minimum  will  not  show  any  discontinuity  because  the  distance  between  two  parallel  lines, 
measured  in  any  direction,  is  the  same  no  matter  where,  along  the  parallel  lines,  the 
measurement  is  taken. 

In practice, this could have very serious consequences. A network will, in general, be
tuned to our best present understanding of the costs and prior probabilities. If the general
characteristics of the sensors are such that the optimum is at or near this point of
discontinuity, then we are less confident than we would otherwise be that our tuning is the best
one. To take the worst case: suppose that we had to tune right at the ambiguous point. We
must make a choice of LOGIC, and of the tuning of the individual sensors. If we select, for
example, AND, and the actual situation (costs, priors) is such that the tangent line is a little
steeper than we think, everything is fine. We will be operating slightly above the optimum
tuning (F(t*), D(t*)), but not very much. However, if the tangent is slightly flatter, we will be
operating quite far from the optimal point, with corresponding loss in system performance. As
the slope of the tangent goes to zero the cost of wrong LOGIC falls to zero. The cost due to
being on the wrong lobe (that is, choosing the wrong LOGIC) is not the same as the difference
between the two functions $D_{AND}(F)$ and $D_{OR}(F)$ measured at the same point F.


An example is given in Figures 11.1, 11.2, based upon the sensors described in the
embedding model of [Blankenbecler,Kantor88]. The fundamental table is:

Y=[0,oo] 

(99) 

/0(y)  =  »e*ny 


The upper boundary $\partial^{+}$ may be given in closed form:

w=£fc-//n(i-gl) 


In plotting the dependence of cost on the slope of R it is convenient to use a reduced
measure of cost: $J = f - |R_d/R_f|\, d$. The true value of the difference in performance depends upon
the scale factor $R_f$, which may be extremely large. In Figures 11.1 and 11.2 the independent
variable is the angle between the line of constant cost and the vertical axis. This angle
$\arctan(|R_d/R_f|)$ varies from 0 to $\pi/2$ as the optimal point sweeps around the upper boundary
of the doc.
Figure 11.1 Optimal tuning in fusion. For the embedding model of Equations (99,100) the
optimal tuning is always symmetric. For two values of the parameters n and i we show that as
the direction of the cost minimization rotates, the tuning changes discontinuously, as the logic
changes from AND to OR. The tuning point jumps from one lobe of the boundary of the doc
to another.

Figure 11.2 Expected cost in fusion. For the same cases as in Figure 11.1, the expected cost,
measured in reduced units, does not show any discontinuity.

[Plot title: Optimal tunings for Bayes problem]

Figure 11.3 Tuning and cost in fusion. The critical point corresponds to the line AB. When the
critical value is reached the tuning jumps from point A to point B, while the reduced cost, J,
does not jump.

[Plot title: Two identical detectors in fusion, DOC and OVERALL (MAX) DOC. Horizontal axis: conditional probability of false alarm.]

8.  Resource  constraints  and  mixed  strategies.

We  have  seen  in  Section  7  that  the  finiteness  of  the  set  of  LOGICs  leads  fusion  systems 
to  exhibit  some  of  the  discontinuities  of  discrete  systems.  As  might  be  expected,  this  problem 
also  affects  the  situation  of  constrained  resources.  We  saw  in  Section  6  that,  when  resources  are 
constrained,  the  optimal  tuning  may  correspond  to  a  point  which  is  not  in  the  doc  of  a  discrete 
system.  It  can,  in  some  cases,  be  achieved  by  a  mixed  tuning,  or  by  a  suitable  broadening  of 
the  signal  set,  which  amounts  to  the  same  thing. 

In  the  case  of  a  fusion  system,  the  problem  manifests  itself  as  shown  in  Figure  12.  The 
resource  constraint  passes  through  the  “dimple”  in  the  boundary  of  the  doc.  With  mixed 
strategy  one  could  achieve  the  value  corresponding  to  the  point  Q.  With  a  pure  strategy 
(definite  tuning)  one  cannot  do  better  than  the  point  P  at  which  the  two  boundaries 
(corresponding  to  the  LOGICs  AND  and  OR)  meet.  In  dimensionless  units,  the  added  cost  we 
must  bear  is  given  by  the  depth  of  the  dimple.  The  maximum  perpendicular  distance  from  the 
boundary  of  the  doc  to  the  line  segment  forming  the  convex  hull  across  the  dimple  is  an  upper 
bound  (in  these  absolute  units)  for  the  added  loss  due  to  the  non-convexity  of  the  full  doc. 

In  response  to  this  problem  one  might  ask  why  we  do  not  propose  that  a  mixed 
strategy  be  used,  to  avoid  the  added  cost.  The  problem,  it  seems  to  us,  is  that  mixed  strategy 
requires  coordinated  random  retuning  at  each  of  the  distributed  sensors,  as  well  as  at  the 
fusion  center.  The  communication  costs  of  coordinating  the  retuning  are  likely  to  be  higher 
than  the  cost  of  adding  to  the  communication  capacity  of  the  system  itself,  with  corresponding 
gain  in  system  performance.  Thus,  in  the  design  stages,  one  should  avoid  constructing  fusion 
systems  for  which  the  dimples  in  the  overall  system  doc  are  likely  to  be  in  regions  of  operating 
interest. 
9.  The  problem  of  team  action


In the preceding sections we have concentrated on the case of two actions and two
hypotheses.  Another  problem  of  some  interest  [Tenney81]  is  that  in  which  there  are  only  two 
actions  to  be  taken,  but  each  of  several  agents  may  take  them  independently.  This  problem 
may  be  discussed,  with  considerable  complication,  by  using  the  language  of  updated 
probabilities  and  likelihood  thresholds.  However,  the  same  constructs  that  we  have  used  above 
also  make  it  easy  to  discuss  this  problem. 

To fix notation for this section, let F represent the probability of false alarm and D(F)
represent the probability of detection at a particular sensor. (In other words, we let F itself
stand for the general tuning variable t that defines the region Y(a=act)). With N different
sensors there are, in fact, $2^N$ different actions, corresponding to which subset of the stations
“choose  to  act.”  We  suppose  that  all  of  the  stations  have  identical  impact  on  the  cost  function, 
so  that  the  cost  depends  only  on  how  many  of  the  sensors  act,  and  not  on  which  ones  they  are. 
Specializing  further  to  the  case  of  two  stations  we  see  that  the  cost  matrix  will  have  three 
columns  and  two  rows: 


C(a,h)          Number acting
                0          1          2
h=1             C(0,1)     C(1,1)     C(2,1)
h=0             C(0,0)     C(1,0)     C(2,0)          (101)

We may reasonably suppose that C(0,1) is the largest cost element, and C(0,0) is the
lowest cost element. It is also clear that C(j+1,0) > C(j,0). It is almost certainly the case that
C(0,1) > C(2,0). That is, the cost of the disease is greater than the cost of the cure. Any further
assumptions are debatable, depending on the effectiveness of isolated action, the cost of
resources consumed in responding, and so forth. For example, if a single response is totally
effective then C(1,1) = C(1,0) = the cost of making one response, and C(2,1) = C(2,0) > C(1,1)
because further resources are consumed unnecessarily. When a single response is not certain to
be effective this inequality is likely to be reversed.

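Let $F_j$ stand for the false alarm rate corresponding to the tuning of the j-th sensor and
$D_j$ stand for the corresponding probability of detection. The overall expected cost of this
tuning is then:

$$EC(F_1,F_2) = p^0_0 \left( C(0,0)\, \bar{F}_1 \bar{F}_2 + C(1,0) [F_1 \bar{F}_2 + \bar{F}_1 F_2] + C(2,0)\, F_1 F_2 \right)$$

$$\qquad + p^0_1 \left( C(0,1)\, \bar{D}_1 \bar{D}_2 + C(1,1) [D_1 \bar{D}_2 + \bar{D}_1 D_2] + C(2,1)\, D_1 D_2 \right) \tag{102}$$

where an overbar denotes the complement, $\bar{X} = 1 - X$.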

The variables $F_1$ and $F_2$ are free to range independently over the unit square in $(F_1, F_2)$
space.  Our  formulation  of  this  problem  is  quite  different  from  the  treatment  in  sections  1-8. 
We  have  not  explicitly  separated  the  problem  into  the  determination  of  a  doc  and  the  selection 
of  an  optimal  point.  The  expected  cost  function  here  directly  involves  both  the  operating 
characteristic  and  the  priors  and  cost  parameters.  Thus  there  will  be  a  separate  problem  to 
solve  for  each  choice  of  the  parameters.  However,  the  problem  of  determining  the  two  doc 
boundary functions represented by $D_1(F_1)$ and $D_2(F_2)$ can be solved once and for all. They
can  be  incorporated  as  subroutines  in  an  overall  optimization  program  to  find  the  minimum 
cost  tuning.  The  Kuhn-Tucker  conditions  of  this  optimization  problem  are  the  coupled 
equations  of  [Tenney81]. 
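The idea of using the boundary functions as subroutines inside an overall optimization can be sketched with a brute-force grid search. This is a minimal illustration, not the program used in the study: the function names `team_cost` and `grid_search`, the common boundary D(F) = sqrt(F), the prior, and the cost table are all hypothetical. The cost table follows the "single response is totally effective" example, with C(1,1) = C(1,0) and C(2,1) = C(2,0).

```python
def team_cost(F1, F2, D, p_threat, C):
    """Expected cost of Equation (102) for two stations sharing boundary D(F).

    C[n][h] is the cost when n stations act and hypothesis h holds
    (h = 0: no threat, h = 1: threat).
    """
    D1, D2 = D(F1), D(F2)
    no_threat = (C[0][0] * (1 - F1) * (1 - F2)
                 + C[1][0] * (F1 * (1 - F2) + (1 - F1) * F2)
                 + C[2][0] * F1 * F2)
    threat = (C[0][1] * (1 - D1) * (1 - D2)
              + C[1][1] * (D1 * (1 - D2) + (1 - D1) * D2)
              + C[2][1] * D1 * D2)
    return (1 - p_threat) * no_threat + p_threat * threat


def grid_search(D, p_threat, C, steps=100):
    """Return (cost, F1, F2) minimizing the expected cost on a uniform grid."""
    grid = [i / steps for i in range(steps + 1)]
    return min((team_cost(a, b, D, p_threat, C), a, b)
               for a in grid for b in grid)


# Hypothetical boundary and costs (illustrative numbers only).
D = lambda F: F ** 0.5
C = [[0.0, 10.0],   # nobody acts: free if no threat, costly if threat
     [1.0, 1.0],    # one station acts: cost of one response
     [2.0, 2.0]]    # both act: cost of two responses
best = grid_search(D, p_threat=0.1, C=C)
print(best)
```

A grid search of this kind scales poorly with the number of stations, so it is useful mainly for checking intuitions about symmetry on small cases.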

An  alternative  way  of  thinking  of  this  problem  is  to  note  that  it  is  a  specific  restriction 
of  the  case  of  the  unrestricted  product  of  the  sensors,  with  four  possible  actions.  The  restriction 
is  that  the  trigger  regions  in  the  combined  signal  set  must  have  a  simple  product  form 
Y  =  Y1xY2.  (See  also  [Sadjadi].) 
10.  Summary  and  Conclusions.

We  have  seen  that  the  concept  of  the  doc  —  a  convex  set  whose  boundary,  in  the 
simple  case  of  binary  hypotheses  is  the  familiar  Receiver  Operating  Characteristic  —  provides 
a  useful  unifying  foundation  for  the  discussion  of  both  discrete  and  continuous  sensor  systems 
in  a  common  language.  This  is  particularly  important  because  continuous  approximations  to 
discrete  systems  are  a  convenient  way  of  achieving  mixed  strategies  (See  Section  8)  while 
discrete  approximations  to  continuous  situations  represent  the  realities  of  signal  binning.  We 
have  made  every  effort  to  develop  the  language  and  notation  in  a  way  that  will  survive 
transition  to  the  case  of  more  than  two  actions  or  more  than  two  hypotheses.  Although  this 
complicates  the  discussion  of  some  of  the  most  familiar  cases,  we  believe  that  it  is  worth  the 
effort. 

We have shown that the doc $\mathcal{D}(S)$, and the set of its extreme points $\partial^{+}(S)$,
represented  by  D(F)  provide  all  the  information  needed  to  solve  any  Bayesian  problem,  for 
either  coordinated  or  team  action.  We  have  further  introduced  a  powerful  notation  for  the 
calculus of sensors, built on the two fundamental operations of full sensor product $S \otimes T$ and
the  M-fold  restriction  representing  either  messaging  or  action,  %(M)S. 

Using  this  machinery  we  have  found  a  number  of  “negative  results”  contradicting 
certain  plausible  beliefs  about  basic  properties  of  networks.  Specifically,  we  have  shown  that: 

(1)  The  fact  that  sensors  are  identical,  and  that  their  messages  are  combined  in  a 
symmetrical  fashion  at  a  fusion  center  does  not  imply  that  the  best  tuning  for  the  individual 
sensors  is  symmetrical  itself.  An  example  is  given  in  Section  5.1. 

(2)  When  one  sensor  is  definitely  better  than  another  (that  is,  its  doc  completely 
contains  that  of  the  poorer  sensor)  it  is  not  necessarily  the  case  that  it  is  better  to  combine  full 
information  from  the  better  sensor  with  a  restricted  message  from  the  poorer.  Nor  is  the  reverse 
the  case.  An  example  is  given  in  Section  5.3  in  which  the  doc’s  for  both  possible  architectures 
are  calculated,  and  it  is  shown  that  neither  contains  the  other. 

(3)  When  signals  from  several  sources  are  to  be  combined  to  determine  an  action  there 
may be discontinuities in the optimal tuning, corresponding to the fact that the tuning of the
fusion  center  itself  is  a  discrete  selection  from  a  set  of  several  possible  LOGICs. 

(4)  Even  though  the  optimal  tuning  for  each  of  the  sensors  in  a  network  will  be  a 
deterministic  tuning  (corresponding  to  a  point  on  the  boundary  of  its  own  doc),  it  is  not  the 
case  that  deterministic  tuning  of  the  network  as  a  whole  is  always  optimal.  We  show,  by 
example,  that  when  there  are  resource  limitations,  as  well  as  cost  criteria,  the  best  possible 
deterministic  fusion  system  may  still  be  suboptimal. 

Our  results  confirm  the  view  that  the  development  of  an  optimal  architecture  based 
on  distributed  sensors  is  a  difficult  problem.  We  have  shown,  by  example,  how  one  may 
construct  the  boundary  of  the  doc  for  a  complex  system  and  may,  once  an  optimal  tuning  t* 
for  the  whole  network  has  been  chosen,  “climb  back”  up  the  structure  to  determine  the  optimal 
tunings  of  all  of  the  constituent  sensors. 

There  are  two  important  directions  for  immediate  exploration.  One  is  to  find  the  most 
efficient possible algorithms for implementing the sensor calculus. We have been using
straightforward grid search to test various preconceptions about symmetry and dominance.
Particularly  because  symmetrical  solutions  are  not  generally  optimal,  the  calculation  for  large 
numbers  of  sensors  may  become  prohibitive,  unless  better  algorithms  can  be  found. 

The second important direction is to establish bounds on the magnitude of the
sub-optimality represented by our various examples. For example, if it could be shown that
symmetric  solutions  are  always  within  .5%  of  the  optimal  tunings  in  fusion,  then  it  might,  in 
many  cases,  be  acceptable  to  use  the  suboptimal  symmetrical  tuning  with  substantial 
computational  savings.  Similarly,  if  it  could  be  shown  that  one  series  arrangement  is  always 
superior  to  another  to  within  a  similar  small  difference,  it  might  be  acceptable  to  eliminate  the 
opposite  architecture  from  consideration. 
Summary

In this paper, optimal control theory is applied to the
design of decentralized sensor systems. Lagrange inequality
multipliers are used to determine the optimal design parameters.
Several models of possible response functions are fully
discussed as examples of our technique.

1.  Introduction  and  Definition  of  the  Problem 

There are many situations in science and engineering in
which information is gathered from a variety of sensors and
must be abstracted or summarised for future processing in order
to comply with communication, storage, or processing constraints.
The simplest example is the case in which a binary
decision must be made based upon information sources that
are constrained to transmit a binary signal. Examples include
data from devices monitoring the performance of a power network,
data from an array of elementary particle detectors, the
coordination of radar or infrared signals, and so on.

In general, the communication restrictions may be lifted
with some increase in cost; thus the examples under discussion
represent a special case. As we shall see, even this simple case
(two alternative states, two possible actions, two-fold signals,
and two detectors) presents challenging problems of analysis.
Discussions have been given by Srinivasan for more than two
detectors¹ and with applications to a specific choice of the
detector characteristics.²

Discussion of a case with distributed action is given by Tenney
and Sandell.³ A discussion for specific (series) topologies
is given by Ekchian and Tenney.⁴ Related problems have been
discussed by Chair and Varshney,⁵ by Reibman and Nolte,⁶
and by Sadjadi.⁷

Quite generally, the performance of an entire network is
summarised by four probabilities $\Pr(y, H)$, of which only two
are independent. (Here, $H = H_0, H_1$ represents two hypotheses
about the world and $y = y_0, y_1$ represents two possible actions
or determinations. This notation will be made more precise
shortly.)

Several  problems  may  be  formulated,  including 

(i) min $\Pr(y_1, H_0)$ subject to $\Pr(y_0, H_1) \le p^0$.

(ii) min $\Pr(y_0, H_1)$ subject to $\Pr(y_1, H_0) \le p^0$.

(iii) min $A \Pr(y_1, H_0) + B \Pr(y_0, H_1)$.

The first and second problems correspond to setting acceptable
error rates; the third arises when there is a tradeoff
between the two types of error. The coefficients A, B may be
positive or negative.

The physical characteristics of an individual detector
constrain the achievable values of $\Pr(y_1, H_0)$ and $\Pr(y_0, H_1)$.
The design of a network is then a selection from among a discrete
set of topologies, with each topology tuned to give its
best possible performance. The tuning is a constrained optimization,
with the constraints determined by the achievable
values of $\Pr(y_1, H_0)$ and $\Pr(y_0, H_1)$.

‘Work  supported  by  the  Department  of  Energy,  contract 
Our work has many points of contact with previous work.
We utilize a Lagrangian formulation to deal with the
optimization problem involving equality and inequality constraints.
Three problems are presented in detail, involving the
cases of exponential response functions, special sums of exponentials,
and block functions. We trace the behavior of the
system tuning and the optimal cost as a function of the detector
discrimination.

This paper is the first of a series whose goal is to clarify
the relations between topics in distributed detection, optimal
control, and experimental design, thereby leading to a more
intuitive or "physical" understanding of the problems of
distributed detection and sensing.

1.1  General  Introduction 

There are two possible states (of the world) $H_0$ and $H_1$.
The prior probabilities of these two states are $p_0$ and $p_1$, where

$p_0 = \mathrm{Prior}(H_0)$ ,
$p_1 = \mathrm{Prior}(H_1)$ .   (1)

There are two possible courses of action ("measures") denoted
by $m_0$ and $m_1$.

The assumed cost function is $C(m, H)$, where

$C(m_0, H_0) = u_0$ ,            $C(m_0, H_1) = u_1 + w_{01}$ ,
$C(m_1, H_0) = u_0 + w_{10}$ ,   $C(m_1, H_1) = u_1$ .   (2)

The expectation value of the cost function is to be minimized
over the various design parameters, those in the response
functions and those in the probability functions. As will become
clear later in our discussion, the separate cost parameters $u_0$
and $u_1$ do not matter when the expected cost is minimized;
the minimum depends only on a ratio involving the differences
in the cost for a given $H_j$, namely $w_{10}$ and $w_{01}$.

The essential point is that for the case of only two possible
states of the world, the preferred action is determined by a
single real number, determined by the posterior odds for the
$H_j$. This is true because, using linear cost theory, the
information in Eq. (2) is summarized by the intersection point of two
straight lines; one describes the cost of action $m_0$ as a function
of $p_0$ while the other describes the cost of action $m_1$.
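The two straight lines can be written out directly. The following is a minimal sketch, assuming the cost matrix of Eq. (2) with positive $w_{01}$ and $w_{10}$; the function names are illustrative and do not appear in the original analysis.

```python
def action_costs(p0, u0, u1, w01, w10):
    """Expected cost of each action as a linear function of p0 (with p1 = 1 - p0)."""
    p1 = 1.0 - p0
    cost_m0 = u0 * p0 + (u1 + w01) * p1   # always choose m0
    cost_m1 = (u0 + w10) * p0 + u1 * p1   # always choose m1
    return cost_m0, cost_m1


def break_even_p0(w01, w10):
    """Value of p0 at which the two cost lines intersect (assumes w01, w10 > 0)."""
    return w01 / (w01 + w10)


def preferred_action(p0, w01, w10):
    """Return 0 for m0 or 1 for m1; u0 and u1 cancel and play no role."""
    return 0 if p0 > break_even_p0(w01, w10) else 1
```

Setting the two costs equal gives $w_{01}(1 - p_0) = w_{10}\, p_0$, so the intersection depends only on the cost differences, which is the statement made in the text.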

1.2 Properties of the Integrator

For our model we choose a fusion structure in which signals
are processed locally at each detector, with messages fed to a
single integrator

A → C ← B .

The problem is to design an integrator C and tune the
sensors (A, B). Each of the two sensors detects some signal
($y$) and sends the central integrator a signal $u_i$. In general,
these signals need not be binary. The integrator then chooses
action $m_0$ or $m_1$, and this choice is determined by the fusion
rules. The rules for both the sensors and the integrator are to
be chosen so that the expected cost is minimized.

The integrator's actions are completely described by a
matrix (with two adjustable parameters) that describes the
probability of choosing measure $m_0$, given the signals $u_i$
from detector $i = a, b$. This matrix will be denoted by
$p(m_0|u_a, u_b)$, where

$p(m_0|0,0) = 1$ ,   $p(m_0|0,1) = g$ ,
$p(m_0|1,0) = d$ ,   $p(m_0|1,1) = 0$ ,

or, in an alternative matrix notation (rows $u_a = 0, 1$; columns $u_b = 0, 1$),

$p(m_0|u_a, u_b) = \begin{pmatrix} 1 & g \\ d & 0 \end{pmatrix}$ .   (4)

The probability of choosing $m_1$ must be the complement,
element by element,

$p(m_1|u) = 1 - p(m_0|u)$ .   (5)

We exclude the possibility of a third course of action. The
design parameters $g$ and $d$ are to be fixed by the optimization;
they define the rule to follow when the two detectors disagree.
If the two detectors are identical, then we expect that $d = g$
and that they will be 0 or 1 depending on the costs and the
details of the sensitivities of the detectors.

1.3  Definition  of  the  Detectors 


Now consider the detectors in more detail. Each detector,
labeled a or b, produces a single "meter reading" $y_i$ ($i = a, b$),
in response to the state of nature. The probabilities $p_i(y; H)$
that the value of the reading is $y$ for the state of the
environment $H$ are, for each detector,

Detector →         a              b

$p_i(y; H_0)$    $f_0^a(y)$     $f_0^b(y)$
$p_i(y; H_1)$    $f_1^a(y)$     $f_1^b(y)$ .

The quantity $y$ must now be converted to a yes or no signal
($u = 0$ or $u = 1$).

The effect of the decision process at each detector may be
summarized completely by a table giving the decision strategy
or probability of response $p_i(u; H)$ for each of the detectors.
For detector a:

$p_a(u_a; H)$ :        $H_0$        $H_1$
   $u_a = 0$            $a$        $1 - A$
   $u_a = 1$          $1 - a$        $A$          (6)

while for detector b:

$p_b(u_b; H)$ :        $H_0$        $H_1$
   $u_b = 0$            $b$        $1 - B$
   $u_b = 1$          $1 - b$        $B$          (7)

1.4  Design  Parameters 

Therefore, the full set of parameters to be determined by
optimization is

$g, d$ ;   $a, A, b, B$ .

The first two describe the operation of the "integrator" that
processes the two signals from the sensor stations to form the
output decision $m$. The last four describe the operation of
the "sensors": they take the detected signals, apply their
respective detection criteria, and form their individual output
signals $u$.

1.5  Properties  of  the  Detectors 


A generalized detector uses the rule: if the signal $y$ is in the
region $R$, then the signal $u_0$ is sent to the integrator. Similarly,
if the signal is in the complement of $R$, i.e., if $y \in \bar{R}$, then $u_1$
is sent.

If the external state is indeed $H_0$, then the response
function of the detector is $f_0(y)$, but if it is $H_1$, the detector
responds with $f_1(y)$ [see the table below Eq. (5)].

For any choice of $R$, the detection probability [see Eq. (6)] is

$a = \int_R dy\, f_0(y)$ ,   (8)

and this then implies for $A$,

$A = \int_{\bar{R}} dy\, f_1(y)$ .   (9)

Similar relations hold for $b$ and $B$. If the response functions
have interlaced maxima, then the region $R$ (and $\bar{R}$) may be
disconnected.

As $R$ expands, clearly $\bar{R}$ contracts. For any fixed value of
$a$ there is a maximum and a minimum possible value for $A$.
If the response functions $f_0(y)$ and $f_1(y)$ overlap, which is the
general and expected case, then these limits on the value of $A$
have important consequences.

The possible values of $A$ for a fixed value of $a$ are traversed
as the region $R$ is varied. It is clear that to make $A$ as large as
possible for a given value of $a$, $R$ should contain those points
whose contribution to $A$ would be as small as possible (i.e., the
ratio $f_1/f_0$ small) while the complement contains those points
with large values of this ratio. This is the familiar likelihood
ratio threshold rule.
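The threshold rule can be checked by brute force on a small discrete example. The sketch below is an illustration only: it assumes the reading $y$ takes finitely many values (bins), with `f0[i]` and `f1[i]` the probabilities of bin `i` under $H_0$ and $H_1$; the function names are hypothetical. It verifies that no region with at least the same $a$ as a threshold region achieves a larger $A$.

```python
from fractions import Fraction
from itertools import combinations


def rates(R, f0, f1):
    """Return (a, A) for region R: a = P(y in R | H0), A = P(y not in R | H1)."""
    a = sum(f0[i] for i in R)
    A = 1 - sum(f1[i] for i in R)
    return a, A


def threshold_region(f0, f1, t):
    """Likelihood ratio threshold region: bins with f1/f0 <= t."""
    return frozenset(i for i in range(len(f0)) if f1[i] <= t * f0[i])


def all_regions(n):
    """Every subset of the n bins."""
    for k in range(n + 1):
        for R in combinations(range(n), k):
            yield frozenset(R)


def threshold_is_optimal(f0, f1, t):
    """True if no region with a >= a_t has A larger than the threshold region."""
    a_t, A_t = rates(threshold_region(f0, f1, t), f0, f1)
    return all(A <= A_t
               for R in all_regions(len(f0))
               for a, A in [rates(R, f0, f1)]
               if a >= a_t)
```

Exact rational arithmetic (`Fraction`) is used so that the comparisons are not affected by rounding.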

If  a  goes  to  1,  then  A  goes  to  zero.  Also,  if  A  is  1,  then  a 
must  vanish.  This  follows  trivially  from  the  unit  normalization 
of  the  response  functions. 

Finally, note that an ideal detector with perfect discrimination
has response functions that satisfy $f_0(y) \times f_1(y) = 0$ for
all $y$. In this case, the values of $a$ and $A$ are independent. We
will return to this limiting case shortly.

2.  The  Cost  Function 

The expected value of the cost function is

$\langle C \rangle = \sum_{i,j} C(m_i, H_j)\, p(m_i|H_j)\, p_j$ ,   (10)

where $p(m|H)$ is directly expressed in terms of the detector
properties, and we assume that the signals received by the
detectors are stochastically independent:

$p(m_i|H_j) = \sum_{u_a, u_b} p(m_i|u_a, u_b)\, p(u_a|H_j)\, p(u_b|H_j)$ .   (11)

Using the explicit form of the cost matrix, Eq. (2), Eq. (10) can be
expressed as

$\langle C \rangle = w_{01}\, p(m_0|H_1)\, p_1 + w_{10}\, p(m_1|H_0)\, p_0 + u_0 p_0 + u_1 p_1$ .   (12)

Additive constants do not matter in the minimization; the
last two terms are fixed, and are the cost for an ideal system.
For such a system with perfect discrimination, the off-diagonal
probabilities $p(m_0|H_1)$ and $p(m_1|H_0)$ both vanish since
$A = 1 = a$. The cost must be a minimum:

$\langle C \rangle_{min} = u_0 p_0 + u_1 p_1$ .   (13)

The quantity that we want to minimize is the additional cost
due to imperfections in the system; this has the form

$\langle \delta C \rangle = \langle C \rangle - \langle C \rangle_{min}$
$\qquad\;\; = w_{01}\, p(m_0|H_1)\, p_1 + w_{10}\, p(m_1|H_0)\, p_0$ .   (14)

Note that the position of the minimum will depend on the
ratio

$W = \dfrac{w_{10}\, p_0}{w_{01}\, p_1}$ ,   (15)

which is the relative expected cost of being wrong if the state
of the environment is $H_0$ (and responding with $m_1$) compared
to the cost of being wrong if it is $H_1$ (and responding with $m_0$).
The magnitude of the minimum cost will depend multiplicatively
on the factor $w_{01}\, p_1$.

It is convenient to rewrite the cost function as

$J = \langle \delta C \rangle / (w_{01}\, p_1)$ ,   (16)

or

$J = p(m_0|H_1) + W[1 - p(m_0|H_0)]$ .   (17)

The  minimization  of  the  expected  value  of  the  cost  is  equivalent 
to  minimizing  J. 
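As an illustration, $J$ can be evaluated directly from the fusion matrix of Eq. (4), the detector tables (6) and (7), and the independence assumption of Eq. (11). This is a sketch with hypothetical function names, not code from the original study.

```python
from itertools import product


def p_m0_given_u(ua, ub, g, d):
    """Integrator rule of Eq. (4): probability of choosing m0 given (u_a, u_b)."""
    return {(0, 0): 1.0, (0, 1): g, (1, 0): d, (1, 1): 0.0}[(ua, ub)]


def p_u_given_H(u, j, p0_correct, p1_correct):
    """Detector table: P(u=0|H0) = p0_correct, P(u=1|H1) = p1_correct."""
    if j == 0:
        return p0_correct if u == 0 else 1.0 - p0_correct
    return p1_correct if u == 1 else 1.0 - p1_correct


def p_m0_given_H(j, g, d, a, A, b, B):
    """Eq. (11) for m0: sum over the four signal pairs, detectors independent."""
    return sum(p_m0_given_u(ua, ub, g, d)
               * p_u_given_H(ua, j, a, A)
               * p_u_given_H(ub, j, b, B)
               for ua, ub in product((0, 1), repeat=2))


def J(W, g, d, a, A, b, B):
    """Normalized excess cost of Eq. (17)."""
    return (p_m0_given_H(1, g, d, a, A, b, B)
            + W * (1.0 - p_m0_given_H(0, g, d, a, A, b, B)))
```

For example, `J(W, g, d, 1, 1, 1, 1)` returns 0 for perfect detectors, and `J(W, g, d, 0, 0, 0, 0)` returns `1 + W` for detectors that are always wrong, whatever the values of `g` and `d`.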




Some interesting limits on $J$ can now be determined. The
perfect detector has $J = 0$. It is amusing to note that a
detector that is always wrong has $J = 1 + W$. (One would then use
such a detector "backward.") A more interesting case follows
from noting that if $W$ is sufficiently small, i.e., the cost $w_{10}$ (of
erroneously choosing $m_1$) is small, then a good strategy is to
always choose $m_1$. This implies that $p(m_0|H_1) = p(m_0|H_0) = 0$,
and $J = W$ (and $g = d = 0$). If, on the other hand, $W$ is
larger than 1, then one wants to always choose $m_0$; in this
limit, $J = 1$ (and $g = d = 1$). The final cost for this
limiting case may be expressed in terms of the step function $\theta(x)$
($\theta(x) = 1$, $x > 0$; $\theta(x) = 0$, $x < 0$):

$J_{max} = W + (1 - W)\,\theta(W - 1)$ ,
$g = d = \theta(W - 1)$ .   (18)

This result arises in another way. If the response functions are
the same, $f_0(y) = f_1(y)$, then no discrimination is possible,
and we find $A = 1 - a$. Using this relation in the probabilities,
we find the above result by choosing the obvious optimum.

The general optimization problem consists of choosing the
design parameters so the expected cost lies as far below $J_{max}$
as possible and as close to the ideal case, $J = 0$, as possible.
We now turn to a general discussion of the problem of finding
extrema when the constraints define a connected subset of the
real line for each variable.

3. General Minimization with Inequalities

Using the form of the probabilities defined in Eqs. (6) and
(7), one finds the explicit expressions

$p(m_0|H_1) = 1 - p(m_1|H_1)$
$\qquad\qquad = (1 - A)(1 - B) + g\,(1 - A)B + d\,A(1 - B)$ ,   (19)

and

$p(m_0|H_0) = 1 - p(m_1|H_0)$
$\qquad\qquad = ab + g\,a(1 - b) + d\,(1 - a)b$ .   (20)

Using Eqs. (19) and (20), the minimization problem can be
re-cast explicitly as

$J = W + [S + gT + dU]$ ,   (21)

where

$S = (1 - A)(1 - B) - Wab$ ,
$T = (1 - A)B - Wa(1 - b)$ ,   (22)
$U = A(1 - B) - W(1 - a)b$ ,

and all the variables must satisfy inequality constraints. A
complete mathematical treatment for problems of this type
can be found in the excellent book by Hestenes.8 A reference
that discusses such variational problems in a language perhaps
more familiar to physicists and engineers is available.9

To minimize $J$, in the case that the variables $g$ and $d$ occur
linearly in $J$, but have a restricted range from zero to one, it
is convenient to form the variational functional $J_{var}$, where

$J_{var} = J - \gamma\, g(1 - g) - \delta\, d(1 - d)$ .   (23)

The optimum will be a saddle point in $(\gamma, \delta)$ versus $(g, d)$. In
this case $J$ is a linear function of $g$ and $d$, hence the extrema will
occur at the endpoints. The Lagrange inequality multipliers
$\gamma$ and $\delta$ must be zero if their associated variable $g$ or $d$ is
inside the allowed range, and non-negative if they are on the
boundary.9 As usual, the derivative with respect to $g$ must
vanish at the minimum and this yields the condition

$0 = T - \gamma(1 - 2g)$ .   (24)

This takes the place of paired Kuhn-Tucker conditions for
$g \ge 0$ and $g \le 1$. If $T$ is nonzero, which is the typical case, then
the minimum must be on the boundary ($\gamma$ cannot be zero). If
$T$ is positive, then $g$ vanishes; if negative, then $g$ is unity. A
similar argument holds for $d$ and $U$. The result can be
expressed as

$g = \theta(-T)$ ,
$d = \theta(-U)$ ,   (25)

and the minimum of $J_{var}$ becomes

$J_m = W + [S + T\,\theta(-T) + U\,\theta(-U)]$ .   (26)

Note that if $T$ or $U$ vanishes, there is no uncertainty in the
minimum of $J$, even though $g$ and $d$ are not determined.
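The rule $g = \theta(-T)$, $d = \theta(-U)$ is simple enough to state in a few lines of code. The following sketch (function names are illustrative) evaluates $S$, $T$, $U$ from Eq. (22), tunes the integrator, and returns $J_m$ from Eq. (26); `cost` is the untuned form, Eq. (21), included so the two can be compared.

```python
def stu(W, a, A, b, B):
    """The three combinations S, T, U of Eq. (22)."""
    S = (1 - A) * (1 - B) - W * a * b
    T = (1 - A) * B - W * a * (1 - b)
    U = A * (1 - B) - W * (1 - a) * b
    return S, T, U


def theta(x):
    """Step function: 1 for x > 0, 0 otherwise (the value at 0 is a convention)."""
    return 1.0 if x > 0 else 0.0


def cost(W, g, d, a, A, b, B):
    """J of Eq. (21) for arbitrary integrator parameters g and d."""
    S, T, U = stu(W, a, A, b, B)
    return W + S + g * T + d * U


def tuned_integrator(W, a, A, b, B):
    """Return (g, d, J_m) with g and d chosen as in Eq. (25)."""
    S, T, U = stu(W, a, A, b, B)
    g, d = theta(-T), theta(-U)
    return g, d, W + S + g * T + d * U
```

Because $J$ is linear in $g$ and $d$, no interior value of either parameter can do better than the endpoint selected by the sign of $T$ or $U$.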

The variables left to consider are $a$, $A$, and $b$, $B$. Each of
these variables has a restricted range, so inequality multipliers
will again be used. As was noted before, the possible values
of $A$ are limited by the form of the response function and the
value of $a$. This can be expressed as the statement that for any
choice of the region $R$, with $a$ given by (8), one must have

$A_{min}(a) \le A(R) \le A_{max}(a)$ .   (27)

Of course, similar restrictions apply to $B$.

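These inequalities can be treated as above. Write the
variational functional in the form

$J_{var} = J_m - F - f$ ,   (28)

where

$F = \sigma_a (A - A_{min})(A_{max} - A) + \sigma_b (B - B_{min})(B_{max} - B)$ ,   (29)

$f = \alpha\, a(1 - a) + \beta\, b(1 - b)$ ,   (30)

and the multipliers $\sigma_a$, $\sigma_b$, $\alpha$ and $\beta$ must be zero if their
associated variables are inside the allowed range and
non-negative if they are on the boundary.

Now the variation with respect to $A$ yields

$2\sigma_a (A - A_{avg}) = -\,\partial J_m / \partial A$ ,   (31)

where $A_{avg} = (A_{min} + A_{max})/2$.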

It is a straightforward task, though somewhat tedious, to
discuss the general case. First note that the above equation
becomes

$2\sigma_a (A - A_{avg}) = (1 - B) + B\,\theta(-T) - (1 - B)\,\theta(-U)$ .   (32)

Since the right-hand side is never negative, $\sigma_a$ cannot vanish,
and hence $A$ must be at its boundary. Since $\sigma_a$ must also be
non-negative, it follows that $A$ must be above $A_{avg}$. Repeating
the same argument for $B$ we find that

$A = A_{max}(a)$ ,
$B = B_{max}(b)$ .

These are computable functions of $a$ and $b$ given the response
functions of the detectors. They correspond to the so-called
Receiver Operating Characteristic used in several of the
papers cited above. We shall term these functions the DOC, or
Detector Operating Characteristic, and they will play a
fundamental role in our analysis.

The next stage is to vary $a$ and $b$ within their allowed range
to achieve the overall minimum. One can anticipate that there
may be symmetric ($a = b$) and nonsymmetric minima; which
particular one is the global minimum must be determined from
a more detailed examination using the explicit forms for the
response functions. This will be carried out in the explicit
examples discussed in the next section. First let us discuss the
boundary behavior in $a$ and $b$.

Double boundary: The boundary region in which both
variables are at their limits consists of four terms. They will be
denoted by L(a,b), where $a$ and $b$ can take on the values zero
or one.

L(0,0): For this case, $A = B = 1$, and $J_m = W$ for all $W$.



L(1,0) and L(0,1): For these cases, $A = 0$, $B = 1$, or the
reverse, and [$S = 0 = U$, $T = 1 - W$ and $g = \theta(W - 1)$]

$J_m = W + (1 - W)\,\theta(W - 1)$ ,   (34)

the $J_{max}$ discussed earlier.

L(1,1): For this limit, $A = B = 0$,

$J_m = 1$ .   (35)

Therefore, the minimum of $J_m$ on this double boundary is
always given by Eq. (34), which amounts to setting the detectors
to always signal oppositely. Now let us turn to the single
variable boundaries.

Single boundary: This boundary region is symmetric in both
variables and hence we need only treat the case in which $b$ is
at its limits while $a$ is in the interior. The reversed situation
will yield the same minima. These will be denoted as:

L(a,0): For this case, $B = 1$, and $S = U = 0$. The quantity
$T$ is not zero, with

$J_m = W + T\,\theta(-T)$ ,
$T = 1 - A - Wa$ .   (36)

As noted, $A$ should be equal to its maximum value for a fixed
value of $a$ in order to achieve the minimum value of $T$, as was
shown earlier. The limit cases of $a = 0$ and $a = 1$ are on
the double boundary. Any minimum for $a$ in the interior must
satisfy

$\dfrac{dT(a)}{da} = -\dfrac{dA_{max}(a)}{da} - W = 0$ .   (37)

Since $A_{max}(a)$ is a decreasing function of $a$, there will in general
be a solution in this region if $W$ is in an appropriate range. This
could yield a smaller minimum than that given by the double
boundary result, Eq. (34); however, for this case, we have $g = 1$
and $d$ is not determined, but its value does not matter since
$U = 0$ and one can arbitrarily choose $d = 1$ also.

L(a,1): For this situation, $B = 0 = T$ and $S = 1 - A - Wa$,
with $U = 1 - W - S$. If $U$ is negative, then $J_m = 1$, while if
it is positive, then $J_m = W + S$. Both these cases have arisen
before, and there are no new minima of $J_m$.

Interior: In the interior region, the inequality multipliers must
vanish and the standard variational equations become
symmetric in form. One can safely assume that there will be minima
in this region, but whether any is the global minimum requires
detailed study. Note that generally there will be (local) minima
with $T$ (and/or $U$) both positive and negative with the
corresponding limiting values of $g$ (and $d$). Since this is a standard
well-discussed variational problem, further general treatment
here is not necessary. Let us now turn to an exhaustive
discussion of some explicit examples.

4.  Exponential  Response  Functions 

Consider a detector with response functions given by

$f_0 = n\lambda \exp\{-n\lambda y\}$ ,
$f_1 = \lambda \exp\{-\lambda y\}$ .   (38)

We will assume that $n$ is greater than one without any loss
of generality, so that the likelihood ratio ($f_1/f_0$) is less than
one for $y < y^*$, where $\lambda y^* = (\ln n)/(n - 1)$. Using the above
argument, to achieve the extrema of $A$ for these monotonic
response functions, the region $R$ must be either the range below
or the range above some point $x$ whose value will be determined
by the optimization process.10

Therefore, it is easy to see that there are two cases to
discuss:

                Case I                        Case II

$R$             $0 < y < x$                   $x < y < \infty$
$\bar{R}$       $x < y < \infty$              $0 < y < x$
$a$             $1 - \exp\{-n\lambda x\}$     $\exp\{-n\lambda x\}$
$A$             $\exp\{-\lambda x\}$          $1 - \exp\{-\lambda x\}$
$A(a)$          $(1 - a)^{1/n}$               $1 - a^{1/n}$ .

Thus, $A$ must lie in the region

$1 - a^{1/n} \le A \le (1 - a)^{1/n}$ ,   (39)

and its position in this interval is determined by the particular
choice of the region $R$. Similar relations hold for $b$ and $B$.
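The two columns of the table and the bounds of Eq. (39) can be cross-checked numerically. This is a small sketch with illustrative function names; `lam` stands for the rate $\lambda$ in Eq. (38).

```python
import math


def case_one(x, n, lam):
    """R = (0, x): a = integral of f0 over R, A = integral of f1 over the rest."""
    return 1 - math.exp(-n * lam * x), math.exp(-lam * x)


def case_two(x, n, lam):
    """R = (x, infinity): the roles of R and its complement are exchanged."""
    return math.exp(-n * lam * x), 1 - math.exp(-lam * x)


def A_max(a, n):
    """Upper bound of Eq. (39), reached in Case I."""
    return (1 - a) ** (1 / n)


def A_min(a, n):
    """Lower bound of Eq. (39), reached in Case II."""
    return 1 - a ** (1 / n)
```

Eliminating the threshold $x$ between $a$ and $A$ in each case reproduces the last row of the table, so every threshold traces out a point on one edge of the feasible region.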

As an example, consider the case $n = 2$, and then the
feasible region for $A$ as a function of $a$ is labeled F in the
graph shown in Fig. 1.


Fig. 1. The allowed region of $A$ is plotted for
$n = 2$ as a function of $a$ and labeled F.

Figure 2 shows the graph for the value $n = 4$.


Fig. 2. Same as Fig. 1 but with $n = 4$.

We see that as $n$ increases, the allowed region increases to
eventually include all values of $A$ between zero and one.

4.1 Explicit Minimization

Using the general results derived in the previous chapter,
we have $A$ and $B$ at their maximum allowed values:

$A = (1 - a)^{1/n}$ ,
$B = (1 - b)^{1/n}$ .   (40)
find no point where $J$ was below the value at the symmetric
minimum given above.

Note that as a function of $W = w_{10}\, p_0 / (w_{01}\, p_1)$, the largest
fractional improvement in cost is achievable when $W = 1$. This
is precisely the case in which the prior choice of action is a
matter of indifference, that is,

$u_0 p_0 + u_1 p_1 + w_{01}\, p_1 = u_0 p_0 + w_{10}\, p_0 + u_1 p_1$ .   (54)

This is intuitively reasonable, as one expects the information
from the sensor to be the most valuable in this case.


5.  Invariant  Imbedding 


We now consider a detector whose response functions allow
superior discrimination between the two possible states of the
environment and which contains the previous example as a special
case. The general form that allows a smooth limit back to the
previous model is

$f_0 = n\lambda \exp\{-n\lambda y\}$ ,

$f_1 = \dfrac{(n + 1)\lambda}{n + 1 - z}\, \exp\{-\lambda y\}\, [1 - z \exp\{-n\lambda y\}]$ .   (55)

For values of $z$ near 1 this allows the improved separation
between $f_0$ and $f_1$ since the former is large at $y = 0$ while the
latter is small there. On the other hand, for $z$ equal to zero,
this is the model of the previous section.

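It will again be assumed that $n$ is greater than 1, and
proceeding as before we find:

Case I:    $R$: $0 < y < x$ ,   $\bar{R}$: $x < y < \infty$

   $a = 1 - \exp\{-n\lambda x\}$
   $A = \dfrac{\exp\{-\lambda x\}}{n + 1 - z}\, [n + 1 - z \exp\{-n\lambda x\}]$
   $A(a) = (1 - a)^{1/n} \left[1 + \dfrac{z a}{n + 1 - z}\right]$

Case II:   $R$: $x < y < \infty$ ,   $\bar{R}$: $0 < y < x$

   $a = \exp\{-n\lambda x\}$
   $A = 1 - \dfrac{\exp\{-\lambda x\}}{n + 1 - z}\, [n + 1 - z \exp\{-n\lambda x\}]$
   $A(a) = 1 - a^{1/n} \left[1 + \dfrac{z(1 - a)}{n + 1 - z}\right]$

Thus, $A$ must lie in the region

$1 - a^{1/n} \left[1 + \dfrac{z(1 - a)}{n + 1 - z}\right] \le A \le (1 - a)^{1/n} \left[1 + \dfrac{z a}{n + 1 - z}\right]$ .   (56)

Its position in this interval is determined by the particular
choice of the region $R$. Similar relations hold for $b$ and $B$.
Note that the allowed region of $A$ increases as $z$ increases from
zero to one.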
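The closed forms above can be checked against a direct numerical integration of $f_1$. The sketch below is illustrative only: the function names, the midpoint quadrature, and the cutoff `upper` used to truncate the infinite range are assumptions of this example, not part of the original analysis.

```python
import math


def f1(y, n, lam, z):
    """Response function f_1 of Eq. (55), normalized on y >= 0."""
    norm = (n + 1) / (n + 1 - z)
    return norm * lam * math.exp(-lam * y) * (1 - z * math.exp(-n * lam * y))


def case_one(x, n, lam, z):
    """(a, A) for R = (0, x), using the closed forms of Case I."""
    e = math.exp(-n * lam * x)
    return 1 - e, math.exp(-lam * x) * (n + 1 - z * e) / (n + 1 - z)


def A_upper(a, n, z):
    """Right-hand side of Eq. (56)."""
    return (1 - a) ** (1 / n) * (1 + z * a / (n + 1 - z))


def A_lower(a, n, z):
    """Left-hand side of Eq. (56)."""
    return 1 - a ** (1 / n) * (1 + z * (1 - a) / (n + 1 - z))


def tail_integral(x, n, lam, z, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of f_1 from x to a large cutoff."""
    h = (upper - x) / steps
    return h * sum(f1(x + (k + 0.5) * h, n, lam, z) for k in range(steps))
```

Setting `z = 0` in `A_upper` and `A_lower` recovers the bounds of Eq. (39), which is the smooth limit back to the previous model mentioned in the text.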


To provide maximum contrast with the previous model we
will present data for the value $z = 1$. For this case, the interior
symmetric minimum exists for all $W$ values. An interesting
new behavior is found in this model for small enough $W$ and
$n$; the minimum cost occurs for $g = d = 0$, as before, but as
$W$ increases, these design parameters flip to $g = d = 1$. The
value of $a$ at the minimum jumps discontinuously.

                   n = 2                         n = 4

   W        a       J      J/J_max       a       J      J/J_max

  0.1     0.46    0.088     .88        0.70    0.068     .68
  0.2     0.58    0.160     .80        0.80    0.111     .56
  0.25    0.63    0.191     .764       0.82    0.128     .51
  0.26    0.64    0.197     .758       0.828   0.131     .504
  0.27    0.24    0.203     .752       0.832   0.134     .496
  0.3     0.26    0.22      .733       0.84    0.143     .48
  0.5     0.35    0.31      .62        0.89    0.191     .38
  1.0     0.50    0.47      .47        0.94    0.268     .27
  1.5     0.60    0.56      .56        0.96    0.316     .32
  2.0     0.68    0.63      .63        0.97    0.35      .35
  4.0     0.79    0.77      .77        0.985   0.44      .44
  6.0     0.85    0.83      .83        0.990   0.49      .49
  8.0     0.88    0.87      .87        0.993   0.52      .52
 10.0     0.90    0.89      .89        0.994   0.55      .55

The columns are the same as in the previous table. Note that
the parameters for the minimum (for $n = 2$) show a definite
jump as $W$ passes through the value $\approx 0.265$. At this point,
the optimum values of $g$ and $d$ change from zero to one; in fact,
we find that $g = d = \theta(W - W_0)$, where $W_0 \approx 0.265$.

On the other hand, for $n = 4$, the quantities $T$ and $U$ are
always positive, so that $g = d = 0$. There does not appear to
be a discontinuity in $a$.

At the discontinuity, the cost varies smoothly. This is
intuitively reasonable, since cost is, ultimately, determined by
the position of some tangent hyperplane, along a normal to
the feasible region, which is connected. However, the jump
in design parameters could have serious consequences because
a small variation in the (frequently subjective) data
summarized by the parameter $W$ could require a complete change of
the system parameters $g$ and $d$. This phenomenon has
important implications for the design of constant false alarm rate
systems, which will be discussed elsewhere.11

6.  A  Step  Function  Example 

We now consider a detector whose response functions, in
a certain limit, allow a clean discrimination between the two
possible states of the environment. In that limit, $A \to 1$ does
not force $a$ to zero. We will assume simple "square" response
functions for ease of presentation. The response functions are
chosen to be zero for $y > 3$ and, of course, normalized.

Proceeding as before we find for Case I, $0 < y < x$:

For the range     $0 < x < 1$         $1 < x < 2$               $2 < x < 3$

$f_0(x)$          $1 - \Delta_0$      $\Delta_0$                $0$
$f_1(x)$          $0$                 $\Delta_1$                $1 - \Delta_1$
$a$               $(1 - \Delta_0)x$   $1 + \Delta_0(x - 2)$     $1$
$A$               $1$                 $1 - \Delta_1(x - 1)$     $(1 - \Delta_1)(3 - x)$

A similar table can be evaluated for Case II, $x < y < 3$. If
either $\Delta_0$ or $\Delta_1$ vanishes, then this describes an ideal detector
system.

We need the value of $A_{max}$ for a fixed value of $a$, which is

$A_{max} = 1 - \dfrac{\Delta_1}{\Delta_0}\,(a - 1 + \Delta_0)\,\theta(a - 1 + \Delta_0)$ ,   (57)

while the minimum value is

$A_{min} = \dfrac{\Delta_1}{\Delta_0}\,(\Delta_0 - a)\,\theta(\Delta_0 - a)$ .   (58)

The feasible region for $A$ as a function of $a$, labeled F in
the graph (Fig. 3), is bounded by straight lines.

In the limit that either $\Delta_0$ or $\Delta_1$ vanishes, the allowed region
for $A$ covers the unit square.

It is a simple matter to analyze this problem for the
minimum $J$ corresponding to the maximum allowed $A$ and $B$ values
as given above. Consider the cases:

1. $A = B = 1$, and $a = b = (1 - \Delta_0)$. For these values, $T$
and $U$ are negative and $J = W\Delta_0^2$.

2. $A = B = (1 - \Delta_1)$ and $a = b = 1$. For this case $T$ and $U$
are now positive and $J = \Delta_1^2$.

Thus the final result can be expressed as

$J = \min[W\Delta_0^2,\ \Delta_1^2]$ ,
$g = d = \theta(\Delta_1^2 - \Delta_0^2 W)$ ,
$a = b = 1 - \Delta_0\, \theta(\Delta_1^2 - \Delta_0^2 W)$ ,   (59)
$A = B = 1 - \Delta_1\, \theta(\Delta_0^2 W - \Delta_1^2)$ .

The limit of perfect discrimination, $\Delta_0$ and/or $\Delta_1$ going to zero,
can be easily discussed from the above results.
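The two cases and the closed-form result can be checked against the general expression $J_m = W + S + T\theta(-T) + U\theta(-U)$. The following sketch uses illustrative names (`d0` and `d1` stand for $\Delta_0$ and $\Delta_1$) and restricts attention to the symmetric line $a = b$, $A = B = A_{max}(a)$; it is a numerical spot check, not a proof of global optimality.

```python
def theta(x):
    """Step function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def A_max(a, d0, d1):
    """Upper edge of the feasible region, Eq. (57)."""
    s = a - 1.0 + d0
    return 1.0 - (d1 / d0) * s * theta(s)


def J_m(W, a, A, b, B):
    """Cost with the integrator already tuned; T*theta(-T) equals min(T, 0)."""
    S = (1 - A) * (1 - B) - W * a * b
    T = (1 - A) * B - W * a * (1 - b)
    U = A * (1 - B) - W * (1 - a) * b
    return W + S + min(T, 0.0) + min(U, 0.0)


def closed_form(W, d0, d1):
    """Return (J, g, a, A) as in the final result for the step-function model."""
    switch = theta(d1 ** 2 - d0 ** 2 * W)
    J = min(W * d0 ** 2, d1 ** 2)
    a = 1.0 - d0 * switch
    A = 1.0 - d1 * theta(d0 ** 2 * W - d1 ** 2)
    return J, switch, a, A
```

For instance, with `W = 1`, `d0 = 0.2`, `d1 = 0.3` the first case applies and the cost is $W\Delta_0^2 = 0.04$; raising `W` to 4 switches the optimum to the second case, with cost $\Delta_1^2 = 0.09$.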


Varying $J_m$ with respect to $a$ and $b$ and introducing inequality
multipliers $\alpha$ and $\beta$ to keep these variables between zero and
one, we find the conditions

$n\alpha(1 - 2a) = A^{1-n}(1 - B) - nWb$ ,
$n\beta(1 - 2b) = B^{1-n}(1 - A) - nWa$ ,   (41)

whose solution should contain all relevant minima. Let us
examine the boundary and interior minima in that order. Recall
that $n > 1$ in the following discussion, and we have assumed
for the moment that $T$ and $U$ are positive. This will be proven
shortly for our solutions.

Boundary: The double boundary region has been discussed in
general and the result is a minimum of the form ($a = 0$, $b = 1$
or $a = 1$, $b = 0$)

$J_m = W + (1 - W)\,\theta(W - 1) = J_{max}$ .   (42)

L(a,0): For this single boundary problem, the task is to find
the minimum of $T$, where $T = 1 - A_{max} - Wa$. The result is,
with $p = 1/(n - 1)$,

$A_0 = \left(\dfrac{1}{nW}\right)^p$ .   (43)

For $A_0$ to be less than one, $nW > 1$. The value of $T$ at this
minimum is negative, and

$J_m(\mathrm{Bnd}) = 1 - \dfrac{n - 1}{n}\, A_0$ .   (44)

If $nW$ is slightly larger than one, $nW = 1 + \epsilon$, then it is easy
to see that to lowest order

$J_m(\mathrm{Bnd}) = J_{max} - \dfrac{\epsilon^2}{2(n - 1)}$ .   (45)

L(a,1): For this case, $B = 0 = T$ and $U = 1 - W - S$. If $U$ is
negative, then the minimum of $J_m$ is 1. If it is positive, then
$S$ must be minimized, and this is just the problem discussed
above.

In  summary,  Jm  has  a  minimum  on  the  boundary  given  by 
Eq.  (42)  or  Eq.  (44),  depending  on  the  value  of  nW. 


Interior: In the interior region, the inequality multipliers $\alpha$
and $\beta$ must vanish and Eqs. (41) become symmetric in form.
Thus there is a symmetric solution with $a = b$ and $A = B$.
Unsymmetric solutions will be searched for later. In the
symmetric case, the equation for the optimal probability $A$ is

$nW A^{n-1}(1 - A^n) = (1 - A)$ ,   (46)

which does not have an analytic solution for general $n$. The
limiting behavior of the solution is easily extracted. For large
$W$, $A$ approaches zero, and $a$ approaches one with the behavior
[recall that $p = 1/(n - 1)$]

$A \approx \left(\dfrac{1}{nW}\right)^p$ .   (47)

This is similar to one of the boundary solutions. The minimum
of $J$ in this limit has the form

$J = 1 - 2\,\dfrac{n - 1}{n}\, A + \dots$ .   (48)

Let us now discuss small values of $W$. Note that as $W$
decreases, $a$ decreases. The value of $W$ where $a$ vanishes is

$W(a = 0) = 1/n^2$ .   (49)

For values of $W$ smaller than this value, there is no interior
symmetric solution. At this critical value, $S = 0$. Finally, note
that for this symmetric solution, $T = U$, and using the above
equations,

$T = (n - 1)\, W a A^n$ ,   (50)

which is positive definite. Therefore, the $T$ and $U$ terms do
not contribute to this minimum because $g = d = 0$.

Using the equation for $A$, we find at the minimum

$J_m(\mathrm{Int}) = W + (1 - A)^2 - W(1 - A^n)^2$ .   (51)

This is smaller than the minimum arising from the boundary.
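Since Eq. (46) has no analytic solution for general $n$, a numerical root finder is the natural tool. The sketch below is one possible approach, assuming integer $n > 1$: dividing Eq. (46) by $(1 - A)$ removes the trivial root $A = 1$ and leaves a function that increases monotonically on $(0, 1)$, so simple bisection suffices. The function names and the tolerance are illustrative choices.

```python
def solve_symmetric_A(n, W, tol=1e-12):
    """Interior root of Eq. (46) in (0, 1), or None when n*n*W <= 1 (no interior root)."""
    if n * n * W <= 1:
        return None

    def q(A):
        # Eq. (46) divided by (1 - A): n W A^(n-1) (1 + A + ... + A^(n-1)) - 1
        return n * W * A ** (n - 1) * sum(A ** k for k in range(n)) - 1.0

    lo, hi = 0.0, 1.0          # q(0) = -1 < 0 and q(1) = n^2 W - 1 > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interior_cost(n, W):
    """Return (a, J) at the symmetric interior minimum, Eqs. (40) and (51)."""
    A = solve_symmetric_A(n, W)
    a = 1 - A ** n
    return a, W + (1 - A) ** 2 - W * a ** 2


def boundary_cost(n, W):
    """Single-boundary minimum, Eqs. (43) and (44); requires n*W > 1."""
    A0 = (1.0 / (n * W)) ** (1.0 / (n - 1))
    return 1 - (n - 1) / n * A0
```

For $n = 2$ and $W = 0.5$, Eq. (46) reduces to $A^2 + A - 1 = 0$, so `interior_cost(2, 0.5)` gives $a \approx 0.618$ and $J \approx 0.455$.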

To see this, study the difference of Eq. (44) and Eq. (51)
for sufficiently large $W$ (so that the former exists). If $W$ is
eliminated between Eqs. (46) and (43), the result is

$A_0 = A \left(\dfrac{1 - A^n}{1 - A}\right)^p$ ,   (52)

which shows that $A_0 = A_0(A) > A$. The difference becomes

$J_m(\mathrm{Bnd}) - J_m(\mathrm{Int}) = [1 - (1 - A)^2] - \dfrac{1}{n} A_0^{1-n}\, [1 - (1 - A^n)^2] - \dfrac{n - 1}{n}\, A_0$ .   (53)

For large $W$ this difference approaches zero as $(n - 1)/(n^2 W)$.
For all values of $nW$ larger than one it is a simple matter to
show that it is positive (a numerical proof is easiest).

Some  sample  numerical  results  are: 

                 n = 2                  n = 4

 n²W         a        J-W           a        J-W

  0.1      0.00     -0.000        0.00     -0.00
  1.0      0.00     -0.000        0.00     -0.00
  1.1      0.120    -0.0001       0.082    -0.0000
  1.2      0.218    -0.0009       0.150    -0.0001
  1.3      0.299    -0.0026       0.210    -0.0003
  1.4      0.367    -0.0054       0.261    -0.0007
  1.6      0.475    -0.0144       0.346    -0.0018
  1.8      0.556    -0.0278       0.413    -0.0036
  2.0      0.618    -0.0451       0.467    -0.0061
  3.0      0.791    -0.175        0.636    -0.0260
  4.0      0.866    -0.348        0.725    -0.091
  6.0      0.930    -0.757        0.815    -0.131
  8.0      0.957    -1.203        0.862    -0.219
 10.0      0.971    -1.669        0.890    -0.315
 12.0      0.979    -2.144        0.909    -0.417
 16.0      0.987    -3.117        0.933    -0.629

Recall that $g = d = 0$ for this global minimum. Therefore,
if either detector signals 1, one should make the choice $m_1$ for
any value of $W$.

Perhaps  it  is  more  understandable  to  present  this  data  in 
another  format: 


                   n = 2                         n = 4

   W        a       J      J/J_max       a       J      J/J_max

  0.25    0.250   0.250    1.0         0.725   0.195     .78
  0.4     0.475   0.386     .97        0.827   0.252     .63
  0.5     0.618   0.455     .91        0.862   0.281     .56
  0.75    0.791   0.575     .77        0.909   0.333     .44
  1.0     0.866   0.652     .65        0.933   0.371     .37
  1.5     0.930   0.743     .74        0.957   0.423     .42
  2.0     0.957   0.797     .80        0.969   0.459     .46
  2.5     0.971   0.831     .83        0.976   0.486     .49
  3.0     0.979   0.856     .86        0.980   0.508     .51
  4.0     0.987   0.883     .88        0.966   0.541     .54
  5.0     0.992   0.909     .91        0.989   0.567     .57
  6.0     0.994   0.923     .92        0.991   0.586     .59
  8.0     0.997   0.941     .94        0.994   0.616     .62
 10.0     0.998   0.952     .95        0.995   0.639     .64

The column labeled J/Jmax gives the ratio of the minimum
J to the quantity Jmax defined in (18). Again, for this global
minimum, g = d = 0.
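The two tabulations can be reconciled with a few lines of arithmetic. The sketch below is only a consistency check, under the assumption that the first column of the first table is n^2 W and that its second quantity is J - W; the function names are illustrative, and only a handful of rows are copied in.

```python
# A few rows copied from the first table: (n, n^2 W, a, J - W).
first_table = [
    (2, 2.0, 0.618, -0.0451),
    (2, 4.0, 0.866, -0.348),
    (4, 8.0, 0.862, -0.219),
    (4, 16.0, 0.933, -0.629),
]

# The matching rows of the second table, keyed by (n, W): (a, J).
second_table = {
    (2, 0.5): (0.618, 0.455),
    (2, 1.0): (0.866, 0.652),
    (4, 0.5): (0.862, 0.281),
    (4, 1.0): (0.933, 0.371),
}


def to_second_format(n, n2w, a, j_minus_w):
    """Convert a first-table row into (W, a, J), assuming the column is n^2 W."""
    w = n2w / n**2
    return w, a, w + j_minus_w


def tables_agree(tol=1e-3):
    """Return True if every copied first-table row matches the second table."""
    for n, n2w, a, j_minus_w in first_table:
        w, a1, j1 = to_second_format(n, n2w, a, j_minus_w)
        a2, j2 = second_table[(n, w)]
        if abs(a1 - a2) > tol or abs(j1 - j2) > tol:
            return False
    return True
```

For the rows copied here the two formats agree to the printed precision, which supports reading the first column as n^2 W.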

i.l  Global  Minimum 


As a check that the symmetric minimum is indeed the
global minimum, we have evaluated J throughout the allowed
region of the six variables g, d and a, A, b, B. We could




Fig.  3.  The  allowed  region  F  for  the  parameter 
A  as  a  function  of  a  using  the  discrete  three-step 
model. 


a  detailed  explanation  of  the  critical  value  at  which  this  jump 
occurs.  The  third  model  studied  also  has  such  a  discontinuity, 
and  permitted  a  continuous  transition  to  the  state  of  complete 
information (perfect discrimination). In this case, the cost
depends on the degree of ambiguity in a quadratic manner.

Finally, in all cases, we found that the best cost is achieved
with a symmetric choice of parameters for the individual
detectors. We do not yet have a general characterisation of response
functions for which this is always the case regardless of costs
and prior probabilities.

The problem considered here is not only of theoretical
interest, but has many practical applications ranging from
optimal design of complex particle detector systems to the design
of seismic and warning systems.
The proof that the above minimum is indeed a global
minimum follows simply by letting a and b deviate from the above
values while keeping A and B as close to their optimum values
as allowed by the constraint. For W small enough we have:

\[
\begin{aligned}
a = b &= 1 - \lambda_0 (1 + \epsilon) \\
A = B &= 1 \\
(s + t + u) &= W (1 - a)^2 .
\end{aligned} \tag{60}
\]

It now follows for any positive $\epsilon$ that

\[
(s + t + u) - (s + t + u)_{\min} = W \lambda_0^2 \left[ (1 + \epsilon)^2 - 1 \right] > 0 . \tag{61}
\]

For a and b larger than their optimum values, the constraints
on A and B come into play and

\[
\begin{aligned}
a = b &= 1 - \lambda_0 (1 - \epsilon) \\
A = B &= 1 - \lambda_1 \epsilon
\end{aligned} \tag{62}
\]

and we find

\[
(s + t + u) - (s + t + u)_{\min} =
2 \epsilon \lambda_1 (1 - \lambda_1) + (\lambda_1^2 - W \lambda_0^2) \left[ 1 - (1 - \epsilon)^2 \right] > 0 \tag{63}
\]

if ($\lambda_1^2 - W \lambda_0^2$) is positive (and if $\epsilon$ is positive, of course).

When W grows so that ($\lambda_1^2 - W \lambda_0^2$) becomes negative,
one should repeat the above procedure around the values a =
b = 1 and A = B = 1 - $\lambda_1$ to prove the global nature of the
minimum in this region. Alternatively, one may argue that the
feasible region is defined, in this case, by hyperplanes, so that
the minimum must occur at a vertex, as given above.
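The sign conditions in this argument are easy to probe numerically. The following sketch assumes the excess costs are W λ0² [(1 + ε)² - 1] for deviations below the optimum and 2 ε λ1 (1 - λ1) + (λ1² - W λ0²) [1 - (1 - ε)²] for deviations above it, which is how Eqs. (61) and (63) are read here. The function names and the sample values of W, λ0 and λ1 are hypothetical.

```python
def excess_cost_below(W, lam0, eps):
    """Excess cost for a, b below their optimum values (reading of Eq. 61)."""
    return W * lam0**2 * ((1 + eps) ** 2 - 1)


def excess_cost_above(W, lam0, lam1, eps):
    """Excess cost for a, b above their optimum values (reading of Eq. 63)."""
    return (2 * eps * lam1 * (1 - lam1)
            + (lam1**2 - W * lam0**2) * (1 - (1 - eps) ** 2))


def stays_positive(W, lam0, lam1, steps=50):
    """Scan 0 < eps <= 1 and report whether both excess costs stay positive."""
    for k in range(1, steps + 1):
        eps = k / steps
        if excess_cost_below(W, lam0, eps) <= 0:
            return False
        if excess_cost_above(W, lam0, lam1, eps) <= 0:
            return False
    return True
```

With the illustrative values λ0 = 0.4 and λ1 = 0.6, the scan stays positive for W = 0.5 (where λ1² - W λ0² > 0) and fails for W = 4 (where that combination is negative), matching the case distinction drawn in the text.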

7.  Summary 

We find that the problem of optimal design, with fusion
and detector tuning, is difficult but tractable. Our simple
examples yield some insight into how the best achievable cost
varies between its bounds and how that best cost depends on
the prior distribution and the cost function itself.

By utilising the technique of invariant imbedding, that is
by considering a general class of response functions that
contain the exponential response model as a particular case, we
can trace a discontinuous change in design parameters, even
though the optimum cost varies smoothly. We cannot yet give
APPENDIX B


REVIEW  OF  SOME  RELATED  RESULTS 


Tenney and Sandell [1981] discussed team action in the case of binary
hypotheses and binary actions for two stations. They selected a particular form for
the cost function, established that the optimum signalling/action rule is a likelihood
ratio test, and presented coupled equations determining the likelihood ratio thresholds.
Numerical examples are given, showing the complexity of the problem, for several
choices of the detector response functions f(y|h). The cost function is described by a
single parameter, the cost of having both stations miss. Of course, as this parameter
becomes large, the optimal solution is to have both stations act oppositely, no matter
what the signal.

Ekchian  and  Tenney  [1982]  extend  the  analysis  to  more  general  topologies. 
They  note  that  the  number  of  regions  into  which  each  station  divides  the  space  of 
received  signals  depends  on  how  many  signals  it  may  emit,  and  the  number  of  such 
division  rules  (which  they  call  thresholds,  for  the  binary  case)  is  equal  to  the  total 
number  of  signal  combinations  that  can  be  received  from  all  the  other  stations.  They 
propose  that  an  extension  of  the  dynamic  programming  concept  can  be  applied  to  the 
analysis  of  tree-like  topologies. 

Chair  and  Varshney  [1986]  turn  to  the  fusion  question,  and  discuss  the  optimum 
fusion  rule,  for  a  binary  hypothesis-binary  action  situation,  with  binary  signals. 
Essentially  they  compute  the  updated  conditional  probability  that  either  hypothesis  is 
true,  given  the  signals  from  the  stations,  and  their  known  response  characteristics. 
Although  there  are  some  notational  problems  with  the  paper,  the  result  is  correct. 
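The posterior update behind such a fusion rule can be sketched in a few lines. This is a minimal illustration, not the notation of the cited paper: it assumes the stations are conditionally independent given the hypothesis, and the names `p_detect` (probability of signalling 1 under the threat hypothesis) and `p_false_alarm` (probability of signalling 1 under the null hypothesis) are introduced here.

```python
def posterior_threat(signals, prior_threat, p_detect, p_false_alarm):
    """Posterior probability of the threat hypothesis given binary signals.

    signals[i] is 0 or 1 for station i. Stations are assumed to be
    conditionally independent given the hypothesis.
    """
    weight_threat = prior_threat
    weight_null = 1.0 - prior_threat
    for u, pd, pf in zip(signals, p_detect, p_false_alarm):
        weight_threat *= pd if u == 1 else (1.0 - pd)
        weight_null *= pf if u == 1 else (1.0 - pf)
    return weight_threat / (weight_threat + weight_null)


def fused_action(signals, prior_threat, p_detect, p_false_alarm, threshold=0.5):
    """Act (return 1) when the posterior exceeds a cost-dependent threshold."""
    posterior = posterior_threat(signals, prior_threat, p_detect, p_false_alarm)
    return 1 if posterior > threshold else 0
```

For two identical stations with detection probability 0.9 and false alarm probability 0.1, conflicting reports leave an even prior unchanged, while two concurring alarms push the posterior above 0.98.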

Kushner and Pacut [1982] opened the discussion of the remeasurement problem,
which is the simplest example of what we call the "call-back" problem. There are two
stations, which can communicate with each other, in either direction. At each station,
remeasurement can reduce the probability of error, but it has some cost. They
formulate the problem of deciding when it is appropriate, given the signal from the
other station, to remeasure before proceeding to action. The capacity of the
information channel is not clearly defined, as transmission of the full posterior
probability, which they assume, might be as costly as transmission of the full signal (y)
received. Again, the computations are quite complex, even for the simple exponential
response function.

Srinivasan [1986] provided explicit formulas for the relation among system
operating characteristics and those of the distributed sensors, and noted that the
optimal tuning for the sensors depends on the choice of the fusion rule. Performance
characteristics are given for systems of two and three detectors with slow Rayleigh
fading (equivalent to the case of exponential distributions of signals), and it is noted
that the optimal rule for two detectors is BOTH (called "AND"). Srinivasan, Sharma and
Malik [1986] apply the methods to two, three and four detectors for the case of Swerling
targets, and provide (semi-log) plots of the optimal performance of those systems. They
note that the performance comes quite close to that of a one-sensor system receiving
the same amount of information. Due to some technical assumptions, the plots that
they present for three and four detector systems do not make this clear. [They assume
that repeat pulses of the same amplitude are received at the single site.]
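For the two simplest fusion rules, the relation between sensor and system operating characteristics is elementary. The sketch below is not taken from the cited papers; it assumes independent binary sensors, and the function and parameter names are chosen here for illustration.

```python
def fused_characteristics(p_detect, p_false_alarm, rule):
    """System (detection, false alarm) probabilities for independent sensors.

    rule is "AND" (all sensors must fire) or "OR" (any sensor firing suffices).
    """
    def all_fire(probs):
        product = 1.0
        for p in probs:
            product *= p
        return product

    def any_fire(probs):
        product = 1.0
        for p in probs:
            product *= 1.0 - p
        return 1.0 - product

    combine = {"AND": all_fire, "OR": any_fire}[rule]
    return combine(p_detect), combine(p_false_alarm)
```

With two sensors at (0.9, 0.1), AND gives roughly (0.81, 0.01) and OR roughly (0.99, 0.19). Which rule is preferable depends on the costs and priors, and, as noted above, the best sensor tuning changes with the rule.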

Sadjadi [1986] has extended the discussion to any number of stations and
hypotheses, with the number of actions equal to the number of hypotheses. The ideal
decision rules would fuse the data from all stations, to arrive at a grand updated
probability estimate, but each station must act in ignorance of the others. Thus the
space of received signals (y1, ..., yN) is sliced by hyperplanes parallel to the axes,
representing the thresholds for the individual stations. Each station has as many
thresholds (defined in terms of the likelihood ratios) as are needed to label the
hypotheses. That is, for any signal received, one and only one of the hypotheses will
have a posterior probability larger than any of the other hypotheses. The effect of
the other stations is indirect, through the fact that they all know the form of the
common cost function.

C.iu  [1987]  has  discussed  a  related  problem,  where  he  shows  that  a  nominator- 
detector  scheme  can  lower  computational  costs,  by  screening  unlikely  candidates  using 
a  low  cost  test  ["sensor"].  He  assumes,  however,  that,  for  cases  which  are  not  rejects, 
the  nominator  sensor  passes  full  information  to  the  second  detector. 

Schwartz  and  Pelkowitz  [1987]  have  considered  the  problem  of  minimizing  the 
time  to  reach  a  decision,  for  a  given  fixed  False  Alarm  Rate  (FAR).  This  work  is  not 
directly  relevant  to  what  we  have  studied  up  until  now,  but  will  be  relevant  to  the 
question  of  call-back  strategies,  discussed  below. 

Reibman  and  Nolte  [1987]  have  done  some  model  calculations  for  a  three 
detector  system,  using  shifted  generalized  Gaussian  distributions  to  represent  the 
sensor  response  to  the  alternative  hypotheses  (that  is,  constant  signal  in  generalized 
Gaussian  noise).  They  correctly  establish  that  the  system  in  which  the  fusion  rule  and 
the  sensor  tunings  are  jointly  optimized  is  superior  to  any  system  in  which  either  of 
those  is  fixed  a  priori.  They  do  not  solve  the  optimization  problem  directly,  as  we  do, 
but  use  coupled  equations  which  must  hold  at  the  optimum.  They  find  that  the  optimal 
tuning  for  the  cases  considered  is  symmetric,  but  do  not  know  whether  that  is  a 
general rule or an accident. [We have established (see below) that it is NOT a general
rule.]

Additional  recent  works  are  cited  in  the  references. 

F.4  Our  Own  Related  Work 
F.4.1.  Experimental  Design 

Our  own  thinking  is  strongly  influenced  by  the  work  of  the  statistician 
Blackwell  [1957]  on  the  optimal  design  of  experiments.  The  analogy  is  clear.  An 
optimal  experiment  minimizes  the  chance  of  mistaking  the  hypothesis.  An  optimal 
detection  system  minimizes  the  "cost"  of  mislabelling  the  state  of  the  world.  Blackwell 
was  able  to  show,  by  a  sophisticated  application  of  the  theory  of  two-person  zero-sum 
games,  that  it  is  possible  for  one  experimental  design  to  completely  dominate  another. 
That is, no matter what the cost function, and no matter what the prior probabilities,
the  expected  cost  of  following  the  better  design  is  lower  than  the  expected  cost  of 
following  the  other. 
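Dominance of this kind can be illustrated with a standard sufficient condition: if one experiment is obtained by passing the output of another through a noisy channel (a "garbling"), the original is never worse for any prior or cost matrix. The sketch below checks this numerically for binary hypotheses and binary actions. The likelihood matrices, the channel, and all function names are hypothetical examples, not quantities from this report.

```python
import random


def bayes_risk(likelihood, prior, cost):
    """Minimum expected cost; likelihood[h][y] = P(y | h), cost[h][a]."""
    n_h, n_y, n_a = len(likelihood), len(likelihood[0]), len(cost[0])
    total = 0.0
    for y in range(n_y):
        # For each observation choose the action with least posterior cost.
        total += min(
            sum(prior[h] * likelihood[h][y] * cost[h][a] for h in range(n_h))
            for a in range(n_a)
        )
    return total


def garble(likelihood, channel):
    """Degrade an experiment: Q(z | h) = sum_y P(y | h) * channel[y][z]."""
    n_z = len(channel[0])
    return [
        [sum(row[y] * channel[y][z] for y in range(len(row))) for z in range(n_z)]
        for row in likelihood
    ]


def never_worse(better, worse, trials=1000, seed=0):
    """Test dominance on random priors and cost matrices (binary case)."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.random()
        prior = [p, 1.0 - p]
        cost = [[rng.random() for _ in range(2)] for _ in range(2)]
        if bayes_risk(better, prior, cost) > bayes_risk(worse, prior, cost) + 1e-12:
            return False
    return True
```

A random search of this kind can refute dominance but cannot prove it; the value of a theorem such as Blackwell's is that it replaces the search with a single condition.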


This  is  an  enormously  important  result,  because,  in  real  applications,  estimates 
of  the  cost  function  and  the  prior  probabilities  are  little  more  than  political  and 
social  guesswork.  Hence,  the  possibility  of  establishing  the  superiority  of  a  design, 
independent  of  the  costs  and  probabilities,  is  extremely  valuable.  Blackwell's  result  is 
given  in  terms  of  a  necessary  and  sufficient  condition  on  the  convex  hull  of  the 
vectors  representing  the  joint  probabilities  of  actions  and  hypotheses,  when  the  prior 
probabilities  are  all  equal. 

In  our  own  preliminary  work  the  ideas  of  Blackwell  have  a  direct 
interpretation.  We  determine  the  contour  of  possible  values  of  (F,M),  given  the 
response  functions  f(y|h).  This  contour  is  exactly  the  envelope  of  the  various  convex 
hulls  defined  by  various  choices  of  threshold.  In  particular,  Blackwell's  theorem  states 
that  any  operating  point  (F,M)  in  the  interior  of  the  allowed  region,  is  dominated  by  a 
point  on  the  boundary  of  the  region.  [We  had  obtained  this  result  directly,  using  the 
linearity  of  the  cost  function.]  Further,  from  the  necessity  part  of  Blackwell's 
theorem,  it  follows  that  none  of  the  operating  points  on  the  boundary  of  the  allowed 
region  completely  dominates  any  of  the  other  points  on  that  boundary.  It  may,  further, 
be  shown  that  any  given  choice  of  the  cost  matrix  determines  an  operating  point,  or 
line, on the convex hull of that boundary.
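How a cost matrix selects an operating point on the (F, M) contour can be seen in a one-sensor sketch. The assumptions here are illustrative: F is taken to be the false alarm probability and M the miss probability, the response is exponential with mean 1 under the null hypothesis and mean `mu` > 1 under the threat hypothesis, and `w_false` and `w_miss` stand for cost times prior probability for each kind of error.

```python
import math


def operating_point(threshold, mu):
    """(F, M) for the rule 'declare a threat when y exceeds the threshold'."""
    false_alarm = math.exp(-threshold)       # P(y > T | null), mean 1
    miss = 1.0 - math.exp(-threshold / mu)   # P(y <= T | threat), mean mu
    return false_alarm, miss


def expected_cost(threshold, mu, w_false, w_miss):
    """Linear cost w_false * F + w_miss * M at the given threshold."""
    false_alarm, miss = operating_point(threshold, mu)
    return w_false * false_alarm + w_miss * miss


def best_threshold(mu, w_false, w_miss):
    """Stationary point of the linear cost (requires mu > 1), clipped at zero."""
    ratio = mu * w_false / w_miss
    return max(0.0, math.log(ratio) / (1.0 - 1.0 / mu))
```

Sweeping the threshold traces out the contour; because the cost is linear in (F, M), the optimum is the point where a line of constant cost touches that contour, and changing the weights slides the tangent point along it.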

In  the  proposed  work  we  will  develop  these  relations  fully,  with  particular 
attention  to  two  situations  in  which  it  seems  likely  that  dominance  may  occur: 

*  comparison  of  topologies  and  signal  sets 

*  evaluation  of  call-back  strategies. 

In  a  call-back  strategy,  each  station  has  the  option  of  either  proceeding  to 
signal/act,  or  calling  back  to  one  or  more  of  those  stations  from  which  it  receives 
signals,  to  ask  for  a  more  detailed  report.  It  seems  likely  that  the  optimal 
development  of  call-back  strategies  will  involve  the  maximum  entropy  principle.  In 
particular,  each  station  will  form  estimates  of  what  the  others  are  likely  to  say,  based 
on  assumptions  of  randomness,  but  subject  to  the  signals  they  have  already 
transmitted.  The  MEP  is  the  best  known  technique  for  implementing  this  concept  of 
constrained  randomness.  It  has  been  applied  to  the  analogous  problem  of  information 
retrieval  in  databases,  by  Kantor  [1985]  and  by  Kantor  and  Lee  [1986]. 

F.4.2  Related  work  on  the  Maximum  Entropy  Principle 

The  Maximum  Entropy  Principle  (MEP)  is  a  mathematical  technique  [Jaynes 
1957a, 1957b]  for  making  and/or  facilitating  decisions  in  the  presence  of  probabilistic 
information  and  constraints.  [Smith,  ed.  1982,83,84,85,86]  A  priori  probabilities  or  prior 
information  can  also  be  included  [Johnson  and  Shore  1983].  Applications  of  the  MEP 
approach  include  such  abstract  topics  as  "good  null  hypothesis"  in  statistics, 
computerized  "expert  systems"  [Cheeseman  1983],  the  processing  of  seismic  data  [Burg 
1975],  the  inversion  of  data  in  geologic  problems  [Rietsch  1977],  the  enhancing  of 
blurred  photographs  for  astronomical  and  law  enforcement  uses  [Gull  and  Daniel  1978], 
and  finally  to  the  construction  of  the  quantum  mechanical  density  matrix  from  realistic 
(non-ideal)  data  [Blankenbecler  and  Partovi  1985]. 

One  of  us  has  explored  the  application  of  the  MEP  to  the  problem  of 
retrieving  "relevant  documents"  from  a  very  large  data  base  [Kantor  1984].  Every 
document  may  be  described  by  one  or  more  descriptors  ("keywords")  while  constraints 
take the form of "probabilities of relevance" or "expected values." For example, the
system  may  be  told  that  "keyword  A"  carries  a  70%  chance  of  relevance,  and  so  on. 
The  system  then  gathers  information  about  the  co-occurrence  of  various  combinations 
of  keywords,  such  as  "A  and  B  but  not  C."  The  MEP  uniquely  determines  the  chance 
of  relevance  for  each  such  combination,  given  the  constraints  on  co-occurrence,  and 
the  prior  estimates  of  properties  as  described  above. 

We  have  established  that  this  information  is  sufficient  to  optimize  the  data 
retrieval  process  when  costs  are  measured  by  any  reasonable  combination  of  man  and 
machine  time,  and  have  given  a  general  procedure  for  dealing  with  the  problem  of 
inconsistent  prior  estimates  [Kantor  and  Lee  1986,  1987a]. 

The  most  important  features  of  the  MEP  approach  are  the  following: 

1.  Probabilistic  information  and  constraint  are  accepted  at  all  stages. 

2.  The  alternatives  among  which  a  choice  is  to  be  made  are  presented  in  a 
well-determined  rank  order. 

3.  The  calculations  are  completed  in  "real  time." 

4.  The  output  of  this  procedure,  a  rank-ordering,  could  serve  as  input  for 

another  level  of  decision  making,  i.e.  formulating  appropriate  action. 

These  characteristics  justify  examination  of  the  potential  of  the  MEP  as  a 
decision  tool  in  a  more  general  context.  It  is  clearly  important  for  any  logic  or 
decision  making  system  to  be  able  to  accept  probabilistic  information  and  rank-order 
alternatives  in  real  time. 


The relevance of the MEP to the DS5 problem goes beyond these
generalities. The detailed operation of a distributed system should allow for two-way
messages between the various stations. The communication structure can be
represented as a matrix, whose rows and columns are labeled by the various stations.
The entry at the intersection of Row I and Column J describes the set of signals that
Station J may send to Station I. This description includes an enumeration of the
signals ("1", "2", and so on) and a specification of the corresponding regions in the
y-space which trigger those various signals, depending, as we have noted, on the
signals input to Station J from its neighbors.
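One possible encoding of such a communication matrix is sketched below. The three-station example (A and B each sending one of two signals to C) and all names are hypothetical, and the y-space regions that trigger each signal are left out.

```python
# Stations label both the rows (receivers) and the columns (senders).
stations = ["A", "B", "C"]

# Entry (row I, column J): the set of signals that Station J may send to
# Station I. Missing entries mean "no channel".
signal_sets = {
    ("C", "A"): {"1", "2"},
    ("C", "B"): {"1", "2"},
}


def signals_from(sender, receiver):
    """Signals the sender may transmit to the receiver (empty if none)."""
    return signal_sets.get((receiver, sender), set())


def inputs_of(receiver):
    """Stations that can send at least one signal to the receiver."""
    return sorted(s for s in stations if signals_from(s, receiver))


def as_matrix():
    """Full matrix view, rows and columns in the order of `stations`."""
    return [[sorted(signals_from(col, row)) for col in stations] for row in stations]
```

A sparse dictionary keyed by (receiver, sender) keeps the row and column convention of the text while avoiding explicit empty entries; a two-way link would simply add the transposed key.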

In published work, the signals flow only one way. In particular, there are no
"interrogatory signals" by which Station J may ask Station I to elaborate on its report.
Such elaboration is clearly possible if, for example, a continuous variable (y) has been
replaced by a simple discrete signal u selected from U(J,I). The decision to request an
elaboration is exactly parallel to the decision to "request a document." It must be
based on an estimation of the probability that the elaborated signal will improve
decision making sufficiently to offset the added cost and delay. The work by Kushner
and Pacut cited above is similar in spirit, but considered only remeasurement at a
single station.
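The trade-off described here is an expected-value-of-information calculation. The sketch below treats the simplest case under assumptions introduced for illustration: two hypotheses, two actions, and an elaborated report that is itself binary with known likelihoods `q_threat` and `q_null`. None of these names come from the cited work.

```python
def bayes_cost(p_threat, cost_false_alarm, cost_miss):
    """Expected cost of the better action at posterior p_threat."""
    return min(p_threat * cost_miss, (1.0 - p_threat) * cost_false_alarm)


def should_call_back(p_threat, q_threat, q_null,
                     cost_false_alarm, cost_miss, cost_call):
    """Compare acting now with acting after a binary elaborated report.

    q_threat = P(report = 1 | threat), q_null = P(report = 1 | no threat).
    Returns (request_elaboration, cost_now, cost_after_call).
    """
    cost_now = bayes_cost(p_threat, cost_false_alarm, cost_miss)
    cost_later = cost_call
    for like_threat, like_null in ((q_threat, q_null),
                                   (1.0 - q_threat, 1.0 - q_null)):
        p_report = p_threat * like_threat + (1.0 - p_threat) * like_null
        if p_report > 0.0:
            posterior = p_threat * like_threat / p_report
            cost_later += p_report * bayes_cost(posterior, cost_false_alarm, cost_miss)
    return cost_later < cost_now, cost_now, cost_later
```

With unit error costs, a call-back cost of 0.05 and a report that is right 90% of the time, the call-back is worthwhile at a posterior of 0.5 but not at 0.01, where the decision is already nearly certain.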

Thus,  it  seems  likely  that  the  optimizing  characteristics  of  an  MEP  rank 
ordering  will  carry  over  to  the  DS5  arena.  We  plan  to  study  this  aspect  of  the 
problem  after  detailed  examination  of  a  variety  of  optimal  control  models. 
APPENDIX C
2.  Project  Goals 

The  anticipated  payoff  of  both  stage  one  and  stage  two  of  the  research  is: 

1.  a  simplification  in  the  design  and  comparison  of  distributed  sensor  systems 

2.  an  improvement  in  the  operational  effectiveness  of  such  systems 

3.  a  major  reduction  in  programming  complexity,  leading  to  higher  reliability 
at  lower  cost. 

The  detailed  goal  is  to  examine  the  optimal  design  and  decision  strategy  to  be 
used  with  distributed  sensors.  Particular  attention  will  be  given  to  the  notion  of 
a  dominant  system  design  (in  the  sense  of  Blackwell),  to  the  non-linear  problems 
of  optimal  parameter  (threshold)  determination,  and  finally  to  the  potential  of 
the  Maximum  Entropy  Principle  as  an  optimizing  tool. 


3.  Proposed  Research 


General  observations. 


Every sensor of a distributed network receives a vector of signals (y) and
processes them to produce a much simpler vector of signals (u), whose
components may be binary data or other summary data (such as velocity and position
estimates). These summaries are not to be considered as only facts about the
situation observed by the sensor. They are also ‘generalized keywords’ describing
the full set of information (y) available at the sensor.

Any communicating station may evaluate those keywords (u), in the light of
other information available to it, and decide to request a more detailed report
about the original data (y). In this way, a sophisticated problem of optimal
information retrieval is imbedded in the problem of designing a distributed sensor
system.

The planned research involves three major areas, all bearing on the central
problem of optimal design of distributed sensor systems. They are outlined below
(in subsections 3.1, 3.2, and 3.3).
3.1 Detailed analysis of model problems.

The  invariant  imbedding  technique  is  a  powerful  tool  for  determining  how 
the  properties  found  in  model  problems  depend  on  the  structure  of  the  problem. 
Invariant  imbedding  is  illustrated  in  the  attached  paper  by  Blankenbecler  and 
Kantor. In this particular application we have used it to elucidate the effect of
ambiguous signals on overall system performance. We are able to vary continuously
from a problem where the report from each detector is unambiguous, to one
where they are ambiguous to any desired degree.

In our analysis so far we find that (numerically, and in some models,
analytically) the optimum operating point is the same for both of the detectors in
a simple fusion system. It remains to establish whether this symmetry between
the detectors is universally valid and, if so, why.

It  also  seems  intuitively  clear  that  increasing  the  size  of  the  signal  set  should 
improve  system  performance.  We  will  use  models  of  the  type  already  discussed  to 
see  whether  this  intuitive  relation  is  supported  by  Blackwell’s  theorem.  This  will 
also  lay  the  foundations  for  considering  the  trade-offs  between  communication 
costs  and  system  performance. 

Representative  problems  are: 

1.  What  is  the  corresponding  full  solution  for  3  identical  detectors?  Is  the 
best  rule  for  integration  of  signals  always  majority  rule? 

2. In the two detector case, what improvements result when the detectors are
not identical? How do the thresholds vary?

3.2  Network  topology  and  Blackwell  dominance. 

Networks  may  be  described  by  the  matrices  of  possible  signals,  mentioned 
above,  and  the  signal/action  rules  by  which  each  station  selects  a  signal  or  action. 
The  most  general  station  receives  input  from  the  world  at  large,  receives  signals 
from  one  or  more  other  stations,  emits  signals  to  one  or  more  other  stations,  and 
may  also  take  action.  The  possibilities  may  be  represented  as  follows: 
means A sends a signal to B
indicates  that  A  receives  external  information 
means  that  C  takes  action 
means  that  B  signals  to  A  and 
A  can  request  elaboration  from  B  . 

With  this  notation  the  problem  of  Tenney  and  Sandell  is: 

A  B 

There  is  no  overt  communication. 

The  problem  discussed  by  Kantor  and  Blankenbecler  [attached]  is: 

A --> C <-- B

All  other  models  summarized  above  can  be  similarly  represented. 

A  typical  question  about  topology  is  to  compare: 

System I    A --> C <-- B

System II   A --> B --> C

In  system  II,  C  is  masked  from  A  by  B  (unless  B  uses  an  enlarged  signal 
set.)  Intuitively,  therefore,  the  performance  of  System  II  should  never  be  better 
than  that  of  System  I.  Blackwell’s  Theorem  establishes  a  necessary  and  sufficient 
condition  on  the  Blackwell  vectors,  if  this  dominance  prevails.  Model  calculations 
and  imbedding  techniques  will  be  applied  to  explore  this  relationship. 

3.3  Call-back  and  information-seeking  structures. 

The  work  by  Kushner  and  Pacut,  described  above,  assumed  that  each  station 
transmits  a  sufficient  statistic,  (its  own  estimate  of  the  posterior  probability).  In 
the  other  models,  a  bare  minimum  signal  is  usually  assumed.  The  situation  has 
some  conceptual  similarity  to  the  problem  of  sample  size  in  quality  control.  If 
every  item  is  examined,  one  gets  the  best  possible  result,  but  the  cost  is  too 
high;  sampling  is  therefore  used.  Furthermore  it  is  well  known  that  fixed  block 
sampling  does  not  perform  as  well  as  sequential  sampling,  in  which  the  decision 
to  sample  further  is  based  (in  a  Bayesian  analysis)  on  the  data  obtained  to  date. 
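The sequential idea can be made concrete with Wald's sequential probability ratio test, used here as a classical stand-in for the Bayesian analysis mentioned above. The sketch assumes Bernoulli observations with success probability `p0` under one hypothesis and `p1` under the other; `alpha` and `beta` are the target error rates, and all names are illustrative.

```python
import math


def sprt(samples, p0, p1, alpha, beta):
    """Wald's sequential probability ratio test for Bernoulli data.

    Returns (decision, number_of_samples_used). Sampling stops as soon as
    the accumulated log likelihood ratio crosses either boundary.
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    log_ratio = 0.0
    for count, x in enumerate(samples, start=1):
        if x:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1.0 - p1) / (1.0 - p0))
        if log_ratio >= upper:
            return "accept H1", count
        if log_ratio <= lower:
            return "accept H0", count
    return "undecided", len(samples)
```

The number of samples used depends on the data: clear evidence stops the test early, which is the property that makes sequential schemes, and by analogy call-back schemes, more economical than a fixed block of observations.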


Given this analogy, we expect to be able to show that call-back systems
dominate non-call-back systems with the same size signal sets, and are more
efficient than systems in which all the information obtainable by call-back is
transmitted whether or not it is requested.

Example  systems  for  this  problem  are: 

System III  A --> B    versus

System  IV  A  -<=B 

The  Maximum  Entropy  Principle  is  a  powerful  technique  for  optimal  retrieval 
of  information,  and  we  expect  that  it  will  apply  to  distributed  sensors.  To  apply 
it  we  must  consider: 

1. What are the effective probabilistic constraints implied by available
information on the physical characteristics of the threat environment?

2.  What  is  the  computational  complexity  of  the  optimization  problem,  and 
can  it  be  solved  in  real  time,  given  state-of-the-art  computational  power? 
Does  it  admit  a  high  degree  of  parallel  processing,  especially  for  multiple 
targets? 